EVALUATION


The work completed under Contract F30602-76-C-0349 is significant
because it has established the potential of correctly classifying military
equipment in performing real-time radar change detection exploitation.

With large area reconnaissance radar surveillance systems, methods for
automatically classifying or cueing interpreters to targets are necessary
if intelligence data is to be extracted from these systems in a timely
manner.  The  Air  Force  has  developed  the  Modular  Change  Detector  (MCD)  for 
performing  real-time  change  detection.  This  contract  established  a means 
of  automatically  classifying  targets  which  are  displayed  on  the  output  of 
the  MCD  to  cue  interpreters  to  areas  of  interest.  The  work  accomplished 
under  this  effort  falls  directly  under  TP0-R2C,  Ground  Target  Detection 
and  Identification  whose  objective  is  to  improve  the  Air  Force's  real-time 
exploitation  capabilities. 

The technique developed in this effort is currently being applied
to the advanced real-time UPD-4 Side Looking Radar exploitation system.


RONALD  B.  HAYNES 
Project  Engineer 
1.0  OBJECTIVE


Military  tactical  targets  are  often  composed  of  a number  of  individual 
objects.  For  example,  a mobile  missile  site  contains  radar  units,  support 
equipment  and  launchers.  These  objects  are  clustered  with  some  general 
characterization  such  as  the  spacing  between  objects,  the  number  of  objects 
and,  possibly,  the  pattern  of  deployment.  Similarly,  gun  emplacements, 
armor,  etc.  are  composed  of  clusters. 

The objectives of this study were to 1) develop a cluster detection
procedure, 2) evaluate the procedure with typical outputs from radar
change detection, and 3) evaluate the potential for improvements in system
effectiveness: either more area processed per unit time or better target
discrimination.

1.1  SYSTEM  CONFIGURATION

The  relationship  of  the  cluster  detector  to  the  radar  exploitation 
system  is  shown  in  Figure  1-1.  Changes  detected  by  a radar  change  detection* 
system  are  the  inputs.  The  cluster  detector  determines  which  changes  are 
associated  with  clusters.  The  output  format  is  under  operator  control.  It 
could  be  all  the  changes  with  the  clusters  indicated  by  graphics,  only  the 
clusters  or  only  clusters  in  designated  target  classes  of  interest. 

The  subsequent  processing  is  the  normal  exploitation  management 
and  target  reporting. 

If the change detection is considered as a target screener, then the
cluster detector adds another stage of filtering or discrimination. Rather
than show all changes to the analyst, it either "flags" or passes only those
with a higher probability of being targets.

* Radar  change  detection  is  a procedure  for  detecting  objects  in  radar 
imagery  which  have  moved  between  two  coverages  of  an  area.  The  radar 
return  from  the  object  remains  in  the  difference  image. 
Figure 1-1. Relation of Cluster Detector to Radar Exploitation
System.
1.2  PROCEDURE  DESCRIPTION


There are two primary steps in the procedure: clustering and classification.
These are illustrated in Figure 1-2. The clusterer searches for
all clusters which satisfy some general parameters. Typical parameters
are spacings between objects, size of cluster and number of objects. These
parameters should include all targets of interest. Since the target categories
are time- and situation-dependent, the parameter values are under
manual control.

The cluster classifier discriminates specific target classes of interest.
This feature is useful in searching for specific targets of interest.
Two  further  benefits  of  the  classifier  are  1)  assignment  of  a confidence 
measure  on  the  probability  the  cluster  belongs  to  a target  class  and  2) 
computation  of  target  features.  These  features  could  be  used  by  the  analyst 
in  target  identification  or  might  be  useful  in  cataloging  the  targets  for 
subsequent  review  or  selection  from  a data  base.  These  benefits  were  not 
evaluated  since  they  were  beyond  the  scope  of  the  effort. 


1.3  RESULTS 




Three  experiments  were  performed  to  evaluate  the  procedure.  The  input 
data  were  imagery  samples  of  known  targets  provided  by  RADC.  Each  sample 
was  an  image  pair  from  two  successive  coverages  of  the  same  area.  Only 
those  cases  where  the  targets  moved  between  coverages  were  used.  This 
ensured  that  the  change  detection  procedures  would  give  realistic  outputs. 
That  is,  the  change  detection  parameters  were  set  to  detect  all  the  changes 
due to targets. The false alarms which occurred correspond to these same
parameters. Thus the false alarm distribution is consistent with the target
changes.


The imagery data were processed through a software change detection
system. This software simulates the Modular Change Detector, MCD, developed
[...] representative of current change detection technology.

Figure 1-2. Basic Elements of Cluster Detector.

A series  of  three  experiments  was  performed,  each  representing  a further 
stage  of  development  of  the  procedure.  The  first  was  for  only  one  target 
class, gun emplacement arrays. The initial clusterer correctly grouped all
eight of the gun arrays into separate clusters. The background, or false
alarm, changes were grouped into 52 clusters. The classifier then correctly
discriminated the eight gun arrays as targets while calling one of the
background clusters a target. These results are listed in Table 1-1.

1.4  CONCLUSIONS  AND  RECOMMENDATIONS 

The  principal  result  is  that  an  effective  procedure  was  developed  to 
detect  clusters  of  radar  changes  corresponding  to  military  targets.  It 
appears  that  this  procedure  would  be  a useful  addition  to  radar  change 
detection.  Potential  advantages  are: 

1.  A reduction  in  the  number  of  changes  that  have  to  be  analyzed 
in  subsequent  processing.  This  can  increase  the  search  area 
and  target  completeness  of  an  analyst. 

2. Automatic computation of cluster features and probabilities of
being a target. These could aid the analyst in target discrimination.
The features could also be used to prioritize
or select the target candidates for subsequent analysis.

It  is  recommended  that  the  following  steps  be  taken. 

1.  The  clustering  procedure  would  be  implemented  in  an  interactive 
laboratory  simulation.  Then  the  procedures  for  manual  selection 
of  cluster  and  target  parameters  could  be  defined  and  evaluated. 

* Air  Force  Contract  F30603-73-C-0141 
2. The clustering procedure (with manual parameter selection) would
be configured and evaluated on an on-line implementation. Outputs
from the MCD (or alternate) change detection would be used to
provide a more extensive data set. These data would be further
extended with synthetically generated cluster distributions.

3.  An  operational  implementation  would  be  defined,  based  on  the  above 
two  steps. 

The  first  two  steps  could  be  conducted  on  CDC's  Cyber-Ikon  facility. 
The  third  step  would  be  based  on  the  hardware  configurations  available. 

The  Cyber-Ikon  uses  the  same  hardware  (Flexible  Processors)  as  the  MCD. 

Thus  it  would  be  relatively  easy  to  transfer  the  procedure  from  the 
laboratory  to  the  MCD. 


TABLE 1-1. AUTOMATIC CLASSIFICATION OF GUN EMPLACEMENT ARRAYS


In  the  second  and  third  experiments,  the  targets  were  grouped  into  one 
class  and  backgrounds  into  the  other  class.  No  further  class  distinction 
was  made  since  there  were  not  enough  examples  to  be  statistically  significant. 
Use  of  the  initial  clustering  technique  produced  the  results  shown  in  Table 
1-2. Subsequently a more powerful technique was developed, resulting in the
performance shown in Table 1-3. This more powerful technique showed
excellent discrimination.
TABLE 1-2. AUTOMATIC CLASSIFICATION OF GUN EMPLACEMENT,
HELICOPTER, AND BEACON ARRAYS.

CLASSIFICATION /
GROUND TRUTH        TARGET      BACKGROUND

TARGET                17             0
BACKGROUND             5            93

TABLE 1-3. AUTOMATIC CLASSIFICATION OF HELICOPTER,
HAWK, ARMOR, AND TANK ARRAYS

CLASSIFICATION /
GROUND TRUTH        TARGET      BACKGROUND

TARGET                15             0
BACKGROUND             2           102
2.0  THEORETICAL  DESCRIPTION  OF  TECHNIQUES

This  section  describes  the  automatic  target  pattern  detection  process. 

It  involves  four  steps.  First,  a difference  image  is  created  using  MCD  change 
detection  software.  Second,  change  events  are  extracted  from  the  difference 
image  and  clustered  into  arrays.  Third,  event  ground  truth  and  characteristic 
properties  of  the  arrays  are  used  to  train  the  automatic  classifier.  Finally, 
the  arrays  are  automatically  classified  and  compared  with  ground  truth  to 
establish  the  probability  of  detection  and  the  probability  of  false  alarms. 
These  procedures  are  described  in  the  following  paragraphs. 

2.1  STEP  1:  CONSTRUCTING  THE  DIFFERENCE  IMAGE 

The first step of the automatic classification process is the construction
of a difference image. This is obtained from the SAR data base as shown
in Figure 2-1. A dual coverage region is selected from the SAR data base,
yielding a mission image and a reference image pair. The mission image is
then registered and photonormalized to the reference image. Finally, the
two images are subtracted to obtain the difference image.

SAR  Data  Base 


The  SAR  data  base  for  this  study  consists  of  imagery  from  Gallant  Hand 
and  Reforger  '76.  Regions  of  dual  coverage  (reference  and  mission)  for  change 
detection  analysis  containing  gun  emplacements  (Gallant  Hand)  and  a variety  of 
groups  of  military  vehicles  (Reforger  '76)  are  selected  for  processing. 

Registration  and  Photonormalization 

The mission and reference images are registered, photonormalized, and
subtracted  to  produce  a difference  image.  The  first  step  in  this  procedure 
is  the  manual  selection  of  corresponding  control  points  in  the  two  images. 
These  control  points  are  used  to  evaluate  the  parameters  in  the  warp  equation, 
which  mathematically  relates  the  coordinates  of  an  event  in  the  two  images. 


Figure 2-1. Step 1: Constructing the Difference Image
The output is a warped mission image which achieves point-by-point
registration with the reference image.
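The form of the warp equation is not given in this section. As a minimal sketch only, assuming an affine warp (an assumption, not the model stated in the report), the warp parameters could be estimated from the manually selected control points by least squares. The function names `fit_affine_warp` and `apply_warp` are hypothetical.

```python
import numpy as np

def fit_affine_warp(mission_pts, reference_pts):
    """Least-squares affine map from mission (x, y) points to reference (x, y) points.

    Assumption: the warp is affine; the report does not state its form.
    """
    src = np.asarray(mission_pts, dtype=float)
    dst = np.asarray(reference_pts, dtype=float)
    design = np.hstack([src, np.ones((len(src), 1))])      # rows are [x, y, 1]
    coeffs, *_ = np.linalg.lstsq(design, dst, rcond=None)   # shape (3, 2)
    return coeffs

def apply_warp(coeffs, pts):
    """Map mission coordinates into reference coordinates with fitted coefficients."""
    pts = np.asarray(pts, dtype=float)
    return np.hstack([pts, np.ones((len(pts), 1))]) @ coeffs
```

At least three non-collinear control points are needed to determine an affine warp; additional points are averaged in the least-squares sense.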


After  registration,  the  gray  levels  (intensity)  of  the  warped  mission 
image  are  photonormalized  so  that  corresponding  points  on  each  image  have 
approximately  equal  intensity.  Then  each  pixel  of  the  warped  mission  image 
is  subtracted  from  the  corresponding  pixel  in  the  reference  image  to  produce 
a difference  image.  The  difference  image  suppresses  the  background  and 
provides  change  events. 
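The photonormalization method is not specified here. The following sketch assumes a simple linear gain and offset that matches the mean and standard deviation of the warped mission image to those of the reference image (an assumption), and then subtracts the mission image from the reference image as described above. Function names are hypothetical.

```python
import numpy as np

def photonormalize(mission, reference):
    """Linearly rescale mission gray levels to the reference mean and spread.

    Assumption: a global gain/offset match; the report does not give the method.
    """
    m_std = mission.std()
    gain = reference.std() / m_std if m_std > 0 else 1.0
    return (mission - mission.mean()) * gain + reference.mean()

def difference_image(warped_mission, reference):
    """Reference minus photonormalized mission, pixel by pixel."""
    reference = reference.astype(float)
    normalized = photonormalize(warped_mission.astype(float), reference)
    return reference - normalized
```

With this sign convention, a strong return present only in the mission image appears as a strongly negative (low) value in the difference image, while unchanged background is suppressed toward zero.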

2.2  STEP  2:  CLUSTERING  THE  EVENTS  INTO  ARRAYS 

The second stage of the automatic classification process is the clustering
of change events into arrays, as illustrated in Figure 2-2. The change
image pixels are thresholded in intensity so that only the strong return
pixels which are in the mission image but not in the reference image remain.
Next, contiguous strong return pixels in the thresholded image are connected
together to form events. These events are thresholded according to size so
that only the large events remain. These large events, representing large
strong returns which are in the mission image but not in the reference image,
are automatically clustered into arrays. Clustering is further described
later in this section.
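The event extraction just described (intensity threshold, connection of contiguous pixels, size threshold) can be sketched as follows. This is an illustration under stated assumptions, not the MCD implementation: mission-only strong returns are taken to be pixels at or below the threshold (reference-minus-mission convention), contiguity is taken to be 8-connectivity, and the name `extract_events` is hypothetical.

```python
from collections import deque

def extract_events(diff, intensity_threshold, min_size=2):
    """Group contiguous strong-return pixels into events and drop small ones.

    diff is a list of rows of difference-image values. Each returned event
    records its centroid (row, column) and its size in pixels.
    """
    rows, cols = len(diff), len(diff[0])
    strong = [[diff[r][c] <= intensity_threshold for c in range(cols)]
              for r in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    events = []
    for r in range(rows):
        for c in range(cols):
            if not strong[r][c] or seen[r][c]:
                continue
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:                       # breadth-first flood fill
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):      # 8-connected neighbors (assumed)
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and strong[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(pixels) >= min_size:        # size threshold
                cy = sum(p[0] for p in pixels) / len(pixels)
                cx = sum(p[1] for p in pixels) / len(pixels)
                events.append({"centroid": (cy, cx), "size": len(pixels)})
    return events
```

The centroid and size recorded for each event are the quantities passed on to the clustering step.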

Adaptive  Intensity  and  Size  Threshold  Selection 

An intensity histogram of the change image is computed as illustrated in
Figure 2-3. This histogram shows the number of pixels which have a particular
intensity (gray scale). It has been found that the histogram is qualitatively
Gaussian in shape except for low intensities. At these low
intensities, a departure from a Gaussian shape occurs. The intensity at the
point of departure from Gaussian shape was chosen as the intensity threshold.
It was found that this selection procedure using the intensity threshold for
extracting mission strong return change events, together with an event size
threshold of two pixels, successfully extracted target events and suppressed
noise events for the series of SAR regions processed.
Figure 2-3. Mission Strong Return Event Intensity
Threshold Selection on Change Image
(intensity histogram: number of pixels versus intensity, gray scale)


The intensity histograms, and therefore the intensity thresholds,
differed from one image to another. Thus, the threshold selection can
be considered to be adaptive. This method was successfully used in the SAR
images processed during the later period of the contract.

The intensity at departure from Gaussian shape was a subjective judgement.
However, this departure intensity could be automatically determined
by a computer analysis of the histogram.
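One possible form of such a computer analysis is sketched below. It is not a procedure from this study. It assumes that the Gaussian can be fitted from the histogram peak and the bins above the peak (the side not suspected of departing), and that a departure is declared at the first bin below the peak whose count exceeds the fitted Gaussian by a chosen ratio. The name `departure_threshold` and the parameters `ratio` and `floor` are hypothetical.

```python
import math

def departure_threshold(counts, ratio=3.0, floor=1.0):
    """Return the first bin below the mode that departs from a Gaussian fit.

    counts[i] is the number of pixels with intensity i. Returns None when
    no bin below the mode exceeds ratio * max(fitted count, floor).
    """
    mode = max(range(len(counts)), key=lambda i: counts[i])
    # Fit the width from bins above the mode only (half-distribution).
    upper_weight = counts[mode] / 2.0 + sum(counts[mode + 1:])
    if upper_weight == 0:
        return None
    second_moment = sum(c * (i - mode) ** 2
                        for i, c in enumerate(counts) if i > mode)
    variance = second_moment / upper_weight
    if variance <= 0:
        return None
    # Walk from the mode toward low intensities looking for excess counts.
    for i in range(mode - 1, -1, -1):
        expected = counts[mode] * math.exp(-((i - mode) ** 2) / (2.0 * variance))
        if counts[i] > ratio * max(expected, floor):
            return i
    return None
```

The `floor` parameter keeps isolated single-pixel counts in the far tail from being reported as a departure.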

Clustering  Methods 

At  this  point  in  the  clustering  step,  those  events  which  represented 
large  strong  returns  on  the  mission  image  only  (not  on  the  reference  image) 
have  been  filed  by  centroid  location  and  size.  Next,  these  events  were 
clustered  into  arrays. 

The purpose of grouping events into an array is that a group of events,
located close together, often has military significance that cannot be
determined from the individual events. For example, a target could be composed
of events forming a distinctive pattern. It is desirable that the clustering
algorithm cluster target events without also introducing into the array a
large number of background events.

Two  different  clustering  programs  (Iterative  Clustering  and  Minimal 
Spanning  Tree  Clustering)  were  investigated  and  are  described  below. 

Iterative  Clustering  Method 

This  special-purpose  two-dimensional  clustering  procedure  was  developed 
for  this  contract.  It  is  discussed  here  in  two  parts:  the  analytically 
well-defined  portion,  which  makes  the  assignment  of  events  to  clusters;  and 
the  portion  which  reviews  the  result,  applies  subjective  completion  criteria, 
and  modifies  the  input  parameters  to  drive  the  process  towards  a completion 
state.  Quantitative  details  are  summarized  in  Appendix  A. 
Cluster Assignment

The  first  part  of  the  Iterative  Clustering  Method  concerns  cluster 
assignment.  A linked  list  is  provided  to  keep  track  of  the  cluster  to 
which  each  event  is  assigned  and  to  locate  all  events  in  a given  cluster. 

The  assignment  process  begins  by  assigning  event  number  one  to  cluster 
number  one.  The  next  event  is  tested  to  determine  whether  it  should  be 
assigned to the same cluster or begin a new one. In every case the cluster
assignment is determined by evaluating the conditional probability density
function (i.e., the likelihood of each existing cluster at the location of
the event to be assigned). The event is assigned to the maximum likelihood
cluster unless it is determined that it lies out on the tail of that
cluster's distribution. In that case a new cluster is created.

Events on the tail of a cluster's distribution are identified by integrating
the distribution from its mean value out to the event location. Since the
distributions are normalized, the closer the result of the integration
is to one, the further out on the tail the event lies. The decision
to begin a new cluster is made whenever the integral exceeds a threshold
value. The threshold value is adjusted as required after each assignment
iteration (assignment of each event to some cluster) is completed.

This threshold value is one of two input parameters which control the
state of the clusters. The other is the size of the parameter which determines
the initial clustering assignment and limits the maximum cluster
size.

A probability  density  function  is  associated  with  each  cluster.  When 
a cluster  is  created  it  contains  one  event.  Until  two  more  events  are  added, 
the  probability  density  function  is  assumed  to  be  a circularly  symmetric 
exponential.  The  initial  rate  of  exponential  decay  with  distance  from  the 
cluster  center  for  clusters  with  less  than  three  events  is  the  second  input 
parameter.  With  every  addition  of  an  event  to  a cluster,  the  exponential 
becomes  broader  because  the  kurtosis  of  the  distribution  function  selected 
increases  with  the  number  of  events  assigned  to  the  cluster. 
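A single assignment pass of the kind described above can be sketched as follows. The sketch is deliberately simplified and rests on assumptions not taken from the report: every cluster uses the same circularly symmetric exponential density with a fixed decay rate (no broadening with cluster size and no covariance), the cluster center is the mean of its events, and the names are hypothetical. For the normalized two-dimensional density (k^2 / 2 pi) exp(-k r), the integral from the center out to radius R is 1 - (1 + kR) exp(-kR).

```python
import math

def density(distance, decay):
    """Normalized circularly symmetric exponential density in two dimensions."""
    return decay ** 2 / (2.0 * math.pi) * math.exp(-decay * distance)

def tail_integral(distance, decay):
    """Integral of the density from the cluster center out to `distance`."""
    kr = decay * distance
    return 1.0 - (1.0 + kr) * math.exp(-kr)

def assign_events(events, decay, tail_threshold):
    """One assignment pass: maximum likelihood cluster, or a new cluster on the tail."""
    clusters = []                              # each cluster is a list of (x, y)
    for event in events:
        best, best_like, best_dist = None, -1.0, 0.0
        for cluster in clusters:
            cx = sum(p[0] for p in cluster) / len(cluster)
            cy = sum(p[1] for p in cluster) / len(cluster)
            dist = math.dist(event, (cx, cy))
            like = density(dist, decay)
            if like > best_like:
                best, best_like, best_dist = cluster, like, dist
        if best is None or tail_integral(best_dist, decay) > tail_threshold:
            clusters.append([event])           # event lies out on the tail
        else:
            best.append(event)
    return clusters
```

Raising `tail_threshold` toward one makes clusters more willing to absorb distant events; lowering it encourages the creation of new clusters.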
A departure from circular symmetry is introduced by computing the
two-dimensional covariance of the cluster when three or more events are
assigned to the same cluster. The inverse of the covariance is used in
place of the input parameter causing circular symmetry. It creates a preferred
direction, permitting events lying in that direction to be more easily
drawn into the cluster than equally distant events which lie perpendicular
to the preferred direction.
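The effect of the inverse covariance can be illustrated with a small numerical sketch. It assumes that the inverse covariance enters as a quadratic form (a squared Mahalanobis distance) in place of the circular distance; the exact functional form used in the study is given in Appendix A and is not reproduced here. The event coordinates are hypothetical.

```python
import numpy as np

# Hypothetical three-event cluster, elongated along the x axis.
cluster = np.array([[0.0, 0.0], [4.0, 0.5], [8.0, -0.5]])
center = cluster.mean(axis=0)
inv_cov = np.linalg.inv(np.cov(cluster, rowvar=False))

def mahalanobis_sq(point):
    """Squared distance from the cluster center weighted by the inverse covariance."""
    delta = np.asarray(point, dtype=float) - center
    return float(delta @ inv_cov @ delta)

# Two candidate events at the same Euclidean distance from the center.
along = center + np.array([6.0, 0.0])    # in the preferred direction
across = center + np.array([0.0, 6.0])   # perpendicular to it
```

Here `mahalanobis_sq(along)` is much smaller than `mahalanobis_sq(across)`, so the event lying in the preferred direction is the one more easily drawn into the cluster.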

At the transition from use of the input parameter to use of the covariance,
the breadth of the function is prevented from changing rapidly by
requiring an equality in the value of comparable probabilities for the 2
and 3 event cluster distributions.

Cluster  Assessment 


The  second  part  of  the  iterative  clustering  method  concerns  cluster 
assessment.  The  clustering  process  begins  by  creating  a few  very  large 
clusters.  The  final  solution  obtained  by  any  clustering  process  cannot  be 
unique  unless  external  requirements  are  imposed  on  the  process.  The  process 
described  here  gives  results  which,  if  not  unique,  are  consistently 
reasonable  and  which  are  judged,  subject  to  experimental  confirmation,  to 
be  nearly  independent  of  event  order. 

The  external  requirement  imposed  to  drive  the  cluster  assignment 
process  through  another  iteration  is  maximum  permissible  cluster  size.  A 
judgement  is  required  about  the  scale  size  of  a cluster  that  would  be  caused 
by  the  largest  significant  military  activity  of  interest. 

The  clusters  obtained  after  every  pass  are  examined  to  identify  a 
permissible  solution.  If  the  solution  is  favorably  evaluated,  the  number  of 
clusters  obtained  is  fixed  and  the  solution  is  submitted  to  a stability 
test.  This  is  conducted  by  allowing  events  to  move  to  different  clusters 
but  not  allowing  the  number  of  clusters  to  change.  The  intent  is  to  find 
a state  of  the  solution  which  is  relaxed;  that  is,  each  event  lies  in  its 
maximum likelihood cluster. If the solution is not satisfactory, process
parameters are adjusted to encourage the creation of new clusters until the
largest cluster satisfies the size criterion imposed.


Minimal  Spanning  Tree  (MST)  Clustering  Method 

This method for automatically clustering events was written as a
computer program by R. L. Page* and is based on an algorithm by C. T.
Zahn**. The method involves the construction of the minimal spanning tree
graph of the set of events.

Advantages of the method are that it requires little input other than
the event locations, it is relatively insensitive to permutations in the
order in which event locations are input, and the clusters (arrays of events)
it produces are similar to clusters detected visually by humans when the
events are displayed as an image. The following discussion will define,
exemplify, and characterize the minimal spanning tree clustering method.

An edge-weighted linear graph is composed of a set of points called
nodes (or events) and a set of node pairs called edges with a weight (the
distance between the pair of events) assigned to each edge. Figure 2-4a
depicts a weighted graph with six nodes and nine edges, each of which has
a weight equal to its length. A path in a graph is a sequence of edges
joining two nodes, as (ABCFE) or (BADE). A circuit is a closed path as
(ABCA) or (ACFEDA). A connected graph has paths between any pair of nodes.

A tree is a connected graph with no circuits, and a spanning tree of a
connected graph G is a tree in G which contains all nodes of G. Figures 2-4b




* R.  L.  Page,  "A  Minimal  Spanning  Tree  Clustering  Method",  Comm.  ACM,  17 
(1974),  321-323. 

** C. T. Zahn, "Graph-Theoretical Methods for Detecting and Describing
Gestalt Clusters," IEEE Trans. on Computers, C-20 (1971), 68-86.
Figure 2-4. Cluster Identification with Minimal Spanning Tree
(a) Weighted Linear Graph, (b) Spanning Tree,
(c) Minimal Spanning Tree, (d) Two Subtrees
(Clusters) Formed After Deleting "Significantly"
Longer Edge AD from Minimal Spanning Tree.
and c represent spanning trees of the graph in Figure 2-4a. If the weight
of a tree is defined to be the sum of the weights of its constituent edges,
then a minimal spanning tree (MST) of graph G is a spanning tree whose
weight is minimum among all spanning trees of G. Figure 2-4c is the MST
for Figure 2-4a.

The  basic  idea  of  the  MST  clustering  method  is  to  detect  inherent 
separations  in  the  event  locations  by  deleting  edges  from  the  minimal 
spanning  tree  which  are  significantly  longer  than  nearby  edges.  Such  an 
edge  is  called  "inconsistent."  Zahn  suggests  the  following  criterion: 
an edge is inconsistent if (1) its length is more than f times the average
of the length of nearby edges, and (2) its length is more than s standard
deviations larger than the average of the lengths of nearby edges (the
numbers f and s may be adjusted by the user). The question of determining
which  edges  are  "nearby"  is  also  answered  by  the  user.  The  event  P is 
said  to  be  nearby  event  Q if  event  P is  connected  to  event  Q by  a path  in 
the  minimal  spanning  tree  containing  d or  fewer  edges  (d  is  an  integer 
specified  by  the  user).  Deleting  the  inconsistent  edges  breaks  up  the  tree 
into  several  connected  subtrees.  The  events  in  each  connected  subtree  are 
the  members  of  an  array.  Figure  2-4  d shows  the  two  subtrees  (clusters 
or  arrays)  formed  after  the  significantly  longer  edge  AD  was  deleted 
from  the  minimal  spanning  tree  shown  in  Figure  2-4  c. 

Thus  the  minimal  spanning  tree  clustering  program  inputs  the  large, 
strong  return  change  event  centroid  locations  as  nodes,  forms  the  minimal 
spanning  tree  of  the  nodes,  and  deletes  the  significantly  longer  edges 
of  the  minimal  spanning  tree,  thus  forming  groups  of  subtrees  of  nodes. 

The  change  events  from  either  the  reference  or  mission  image  are  grouped 
into  clusters  in  the  same  way.  In  general,  any  one  cluster  would  be  formed 
of  changes  detected  on  one  of  the  images  only. 

These  subtrees  are  the  event  arrays:  all  events  in  the  same  subtree 
are  assigned  to  the  same  array. 

All clustering done with the minimal spanning tree method used the
following values of the parameters: d = 2, f = 1.3, and s = 0.
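The minimal spanning tree clustering described above can be sketched as follows. This is an illustration, not the Page program. It assumes that the edges "nearby" a given edge are the tree edges reachable within d steps of either of its endpoints (excluding the edge itself), pooled into one average and one standard deviation, and it uses the parameter values d = 2, f = 1.3, and s = 0 as defaults. All function names are hypothetical.

```python
import math
from statistics import mean, pstdev

def minimal_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph; returns (i, j, length) edges."""
    best = {j: (math.dist(points[0], points[j]), 0) for j in range(1, len(points))}
    edges = []
    while best:
        j = min(best, key=lambda k: best[k][0])
        length, i = best.pop(j)
        edges.append((i, j, length))
        for k in best:                          # update cheapest link to the tree
            dist = math.dist(points[j], points[k])
            if dist < best[k][0]:
                best[k] = (dist, j)
    return edges

def nearby_lengths(adj, edge, depth):
    """Lengths of tree edges within `depth` steps of either endpoint of `edge`."""
    u, v, _ = edge
    found = []
    for start, blocked in ((u, v), (v, u)):     # never cross the edge itself
        frontier = [(start, blocked)]
        for _ in range(depth):
            nxt = []
            for node, parent in frontier:
                for nb, length in adj[node]:
                    if nb != parent:
                        found.append(length)
                        nxt.append((nb, node))
            frontier = nxt
    return found

def mst_clusters(points, d=2, f=1.3, s=0.0):
    """Delete inconsistent MST edges and return the resulting arrays of event indices."""
    edges = minimal_spanning_tree(points)
    adj = {i: [] for i in range(len(points))}
    for i, j, length in edges:
        adj[i].append((j, length))
        adj[j].append((i, length))
    parent = list(range(len(points)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for edge in edges:
        near = nearby_lengths(adj, edge, d)
        if near:
            avg, sd = mean(near), pstdev(near)
            if edge[2] > f * avg and edge[2] > avg + s * sd:
                continue                        # inconsistent edge: delete it
        parent[find(edge[0])] = find(edge[1])   # keep edge: join the subtrees
    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

With s = 0, condition (2) reduces to the edge being longer than the average of the nearby edges, which condition (1) already implies whenever f exceeds one.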
2.3  STEP  3:  TRAINING  THE  AUTOMATIC  CLASSIFIER


The third stage of the automatic classification process deals with the
training of the automatic classifier, as illustrated in Figure 2-5. At
this point in the process, the events have been clustered into arrays. Each
array is now labeled as a target or background on the basis of event ground
truth information. Next, twenty features are evaluated for each of the
arrays. Then, from the target arrays, twenty target feature means and the
twenty-by-twenty target feature covariance matrix are calculated. Also,
from the background arrays, the twenty background feature means and the
twenty-by-twenty background feature covariance matrix are calculated. Then
a canonical transformation in the twenty-feature space is performed which
determines the three best generalized features (linear combinations of the
original twenty features) for distinguishing target arrays from background
arrays. Finally, class-conditional probability density functions for target
arrays and for background arrays are parameterized in this three-dimensional
generalized feature space.

A more  detailed  description  of  the  automatic  classifier  training  process 
follows. 

Application  of  Event  Ground  Truth 

At this point in the processing the events have been clustered into
arrays. Next, each array is labeled as a target array or a background array,
using the ground truth information for each event. A particular array is
labeled (a) a "target array" if the percentage of target events in the array
is greater than a specified threshold; (b) a "background array" if the
percentage of background events is greater than another threshold; and (c)
"neither", if neither of the above conditions is met. In all of the
processing, the thresholds in (a) and (b) were both set at 52%. Thus, if an array
was composed of more than 52% target events, the array was labeled a target
array. If an array was composed of more than 52% background events, the
array was labeled a background array. However, if the array was composed of,
Figure 2-5. Training the Automatic Classifier
(flow: tables of target and background arrays; evaluation of 20 feature
means and covariance matrix; tables of target and background means and
covariance matrices)
for example, 50% target events and 50% background events, the array was a
"neither" array and was eliminated from further processing. The threshold of
50-50 is probably optimum unless a priori information is available as to
the probability of a change event being a target.
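The labeling rule can be expressed compactly. The sketch below is illustrative; the function name `label_array` is hypothetical, and the default thresholds are the values quoted in the text above, left as parameters so that other values can be tried.

```python
def label_array(event_is_target, target_threshold=0.52, background_threshold=0.52):
    """Label an array from per-event ground truth flags.

    event_is_target holds one boolean per event in the array
    (True for a target event, False for a background event).
    """
    n = len(event_is_target)
    if n == 0:
        raise ValueError("an array must contain at least one event")
    n_target = sum(1 for flag in event_is_target if flag)
    if n_target / n > target_threshold:
        return "target"
    if (n - n_target) / n > background_threshold:
        return "background"
    return "neither"   # eliminated from further processing
```

An array split evenly between target and background events satisfies neither condition and is therefore dropped.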

The  fact  that  a target  array  is  not  totally  composed  of  target  events 
is  to  be  expected.  First,  noise  (background)  does  occur  among  the  target 
events. Second, noise events located outside of a group of target events
might be included in the array by the clustering routine, which uses
inter-event spacings for cluster assignments.

Evaluation  of  Twenty  Features 

In  order  to  automatically  classify  each  array  as  a target  array  or  a 
background  array,  a group  of  features  must  be  selected  which  differentiate 
between  the  two  types.  Features  of  individual  objects  are  the  size  and 
intensity  of  the  events  in  the  array.  Other  features  describe  the  geometric 
properties  of  the  events  as  a group  within  the  array.  The  twenty  array 
features  are  presented  in  Table  2-1.  A more  detailed  discussion  of  the 
twenty  features  appears  in  Appendix  B. 

Evaluation  of  Twenty-Feature  Means  and  Covariance  Matrices 

The twenty features evaluated for a target array locate a vector X in
the twenty-dimensional feature space. The collection of vectors obtained
from the target arrays is used to evaluate the twenty-feature means and
covariance matrix. These quantities are the parameters in the Gaussian-
modeled target class-conditional probability density function f(X|T), the
probability density that a target array has features X. The parameters
of the background class-conditional probability density function f(X|B)
are similarly determined from the features evaluated from the background
arrays.
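The estimation of the class parameters and the evaluation of a Gaussian class-conditional density can be sketched as follows. The sketch works for any number of features; the function names are hypothetical. Note that a sample covariance matrix estimated from fewer arrays than features is singular, so the density cannot be evaluated in that case without further treatment.

```python
import numpy as np

def fit_class(features):
    """Return the feature means and covariance matrix for one class.

    features has one row per array and one column per feature.
    """
    features = np.asarray(features, dtype=float)
    return features.mean(axis=0), np.cov(features, rowvar=False)

def log_density(x, mean, cov):
    """Logarithm of the Gaussian class-conditional density f(X | class)."""
    delta = np.asarray(x, dtype=float) - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = delta @ np.linalg.solve(cov, delta)
    return -0.5 * (len(mean) * np.log(2.0 * np.pi) + logdet + maha)
```

Calling `fit_class` once on the target arrays and once on the background arrays yields the parameters of f(X|T) and f(X|B) respectively.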
TABLE 2-1. FEATURE DEFINITION

FEATURE   DESCRIPTION

1         Number of events in the array.
2         Spacing of events.
3         Regularity of spacing.
4         Event size.
5         Uniformity of event size.
6         Event shape.
7         Uniformity of event shape.
8         Uniformity of event orientation.
9         Array size.
10        Array shape.
11        Orientation of events to the array.
12        Event area.
13        Array mass size.
15        Array mass shape.
16        Orientation of array mass to array.
17        Distance from mass to geometric centroids.
18        Array pixel density.
19        Event pixel density.
20        Uniformity of event pixel density.

The means and covariance matrices of the targets and background are
used to determine the best class-separating transformation and to model
the probability density functions after the transformation to generalized
features.

Class -Separating  Transformation 

The  number  of  feature  variables  is  reduced  from  twenty  (X)  to  three 
(X1)  by  determining  the  three  independent  linear  combinations  of  the  original 
twenty -feature  variables  which  are  best  at  separating  (in  the  feature  vector 
space)  the  target  points  from  the  background  points.  The  three  new  variables 
are  called  "generalized  features." 

The linear transformation W relating the twenty features to the three
generalized features can be written

$$X' = W X \qquad (2.1)$$

where X' is (3x1), W is (3x20), and X is (20x1).

Equation 2.1 is used to evaluate the target generalized-feature means
(the mean of X') in terms of the target twenty-feature means (the mean of X).
The target twenty-feature covariance matrix C is related to the three
generalized feature covariance matrix C' by

$$C' = W C W^{T} \qquad (2.2)$$

where the T superscript denotes "transpose", C' is (3x3), W is (3x20), C is
(20x20), and W^T is (20x3).

With the same W, Equations 2.1 and 2.2 are used to relate the background
twenty-feature means to the background three-generalized-feature means, and
the background twenty-feature covariance matrix to the three-generalized-
feature covariance matrix, respectively.

A more  detailed  discussion  of  the  class-separating  transformation  is 
given  in  Appendix  C. 
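
Equations 2.1 and 2.2 can be illustrated with a short sketch. The helper
names below are hypothetical, and a toy 2x3 matrix W stands in for the 3x20
transformation of the report, whose actual entries are determined as described
in Appendix C.

```python
def matvec(W, x):
    """Matrix-vector product W x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    """Matrix product A B."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def transform_mean(W, mean):
    """Equation 2.1 applied to a class mean: mean' = W mean."""
    return matvec(W, mean)


def transform_covariance(W, C):
    """Equation 2.2: C' = W C W^T."""
    return matmul(matmul(W, C), transpose(W))


# Toy stand-ins: 3 original features reduced to 2 generalized features.
W = [[1.0, 0.0, 1.0],
     [0.0, 2.0, 0.0]]
mean = [1.0, 2.0, 3.0]
C = [[2.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 3.0]]

mean_prime = transform_mean(W, mean)
C_prime = transform_covariance(W, C)
```

The same W is applied to the target and the background statistics, so only the
inputs `mean` and `C` change between the two classes.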
Parameterization of Probabilities After Transformation


The class-conditional probability density functions for both target
arrays and background arrays are parameterized in the three best class-
separating generalized feature space. The parameters for targets are the
three-component generalized feature vector means (the mean of X') and the
three-by-three generalized feature covariance matrix C' for target arrays.
The same is true for background arrays. Thus, the probability densities for
target arrays, f(X'|T), and for background arrays, f(X'|B), for an array with
generalized feature vector X' can be evaluated. These densities, together
with a priori information about the relative abundance of target and
background arrays, can be used to compute the probability that an array of
unknown classification with generalized features X' is a target array,
P(T|X'), or a background array, P(B|X').

A more  detailed  discussion  of  the  parameterization  of  the  probability 
density  functions  in  the  generalized  feature  space  is  given  in  Appendix  D. 
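
As an illustration of the Gaussian model in the three-dimensional generalized
feature space, the sketch below evaluates a trivariate normal density from a
mean vector and a 3x3 covariance matrix. It is a generic textbook form, offered
under the assumption that the report's Gaussian-modeled densities take this
standard shape; the function names are hypothetical and Appendix D remains the
authoritative description.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def inv3(m):
    """Inverse of a 3x3 matrix via the adjugate (cofactor transpose)."""
    d = det3(m)
    inv = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            # Entry (i, j) of the adjugate is the cofactor of entry (j, i).
            rows = [r for r in range(3) if r != j]
            cols = [c for c in range(3) if c != i]
            minor = (m[rows[0]][cols[0]] * m[rows[1]][cols[1]]
                     - m[rows[0]][cols[1]] * m[rows[1]][cols[0]])
            inv[i][j] = ((-1) ** (i + j)) * minor / d
    return inv


def gaussian_density(x, mean, cov):
    """Trivariate normal density with the given mean and covariance."""
    d = [a - b for a, b in zip(x, mean)]
    inv = inv3(cov)
    quad = sum(d[i] * inv[i][j] * d[j] for i in range(3) for j in range(3))
    norm = (2.0 * math.pi) ** 1.5 * math.sqrt(det3(cov))
    return math.exp(-0.5 * quad) / norm


# Toy check values (not taken from the report).
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
peak = gaussian_density([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], identity)
stretched = [[4.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
off_peak = gaussian_density([2.0, 0.0, 0.0], [0.0, 0.0, 0.0], stretched)
```

The covariance matrix must be non-singular for this evaluation to be defined.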

This completes the discussion of the training of the automatic
classifier illustrated in Figure 2-5.

2.4  STEP  4:  AUTOMATIC  CLASSIFICATION 

The fourth and final stage of the automatic classification process
deals with the actual automatic classification of arrays, as illustrated
in Figure 2-6. At this point in the process, the automatic classifier has
been trained. For each array, the probabilities of its being a target
array and a background array are evaluated as a function of the array's
generalized feature vector X'. The array is automatically classified as
that class which has the greater probability. This automatic classification
of the array is then compared with the previously determined classification
based on event ground truth. From the collection of arrays, the percentages
of correct and incorrect classifications are presented as a confusion matrix.
This yields the probability of detection and the probability of false alarm.
The following discussion describes the automatic classification in greater
detail.


Evaluate Array Target and Background Probabilities

The arrays to be automatically classified are input one at a time to
the trained classifier. For each array, the twenty features are evaluated,
yielding a twenty-vector X for that array. Using Equation 2.1, a generalized
feature three-component vector X' is calculated. Finally, the class-
conditional probability densities for targets, f(X'|T), and for background,
f(X'|B), are evaluated for the array.

These densities are used to calculate the Bayes a posteriori (i.e.,
after the measurement of X') probabilities P(T|X'), the probability that an
array with generalized feature vector X' is a target array, and P(B|X'), the
probability that the array is a background array. Specifically,

$$P(T|X') = \frac{q(T)\, f(X'|T)}{q(T)\, f(X'|T) + q(B)\, f(X'|B)} \qquad (2.3)$$

$$P(B|X') = \frac{q(B)\, f(X'|B)}{q(T)\, f(X'|T) + q(B)\, f(X'|B)} \qquad (2.4)$$

where q(T) and q(B) are the a priori (i.e., before the measurement of X')
probabilities that the array is a target array and a background array,
respectively. If P(T|X') > .50, the array is automatically classified as a
target array; if P(B|X') > .50, it is automatically classified as a
background array. Equations (2.3) and (2.4) have the property that

$$P(T|X') + P(B|X') = 1 \qquad (2.5)$$
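
Equations 2.3 through 2.5 and the decision rule can be sketched as below. The
function names and the density values in the example are hypothetical; only the
priors 8/60 and 52/60 are taken from the first experiment reported in Section
3.1. Treating the exact tie P(T|X') = .50 as background is an assumption that
follows the wording used in that experiment.

```python
def posteriors(f_target, f_background, q_target, q_background):
    """Bayes a posteriori probabilities of Equations 2.3 and 2.4."""
    evidence = q_target * f_target + q_background * f_background
    p_target = q_target * f_target / evidence
    p_background = q_background * f_background / evidence
    return p_target, p_background


def classify(p_target):
    """Target if P(T|X') > .50, otherwise background (tie rule assumed)."""
    return "target" if p_target > 0.5 else "background"


# Hypothetical density values for one array; priors from Section 3.1.
p_t, p_b = posteriors(f_target=0.30, f_background=0.02,
                      q_target=8 / 60, q_background=52 / 60)
label = classify(p_t)
```

Because both posteriors share the same denominator, they sum to one, which is
the property stated in Equation 2.5.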
Automatic Classification vs Ground Truth


A comparison is made for each array between its classification from
ground truth and its classification from the automatic classifier. A confusion
matrix is constructed which lists the number of ground truth target arrays
which are automatically classified as target arrays and background arrays.
This is also done for the ground truth background arrays. From this matrix
the probability of target detection and the probability of a false alarm can
be calculated.
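
A minimal sketch of how the confusion matrix and the two probabilities could
be tallied is given here. The label strings, function names, and the eight-array
toy example are illustrative assumptions, not data from the report.

```python
def confusion_matrix(truth, predicted):
    """Count arrays by (ground truth label, automatic label)."""
    labels = ("target", "background")
    counts = {(t, p): 0 for t in labels for p in labels}
    for t, p in zip(truth, predicted):
        counts[(t, p)] += 1
    return counts


def detection_and_false_alarm(counts):
    """Probability of detection and probability of false alarm."""
    n_target = counts[("target", "target")] + counts[("target", "background")]
    n_background = (counts[("background", "target")]
                    + counts[("background", "background")])
    p_detection = counts[("target", "target")] / n_target
    p_false_alarm = counts[("background", "target")] / n_background
    return p_detection, p_false_alarm


# Toy example: three target arrays and five background arrays.
truth = ["target"] * 3 + ["background"] * 5
predicted = ["target", "target", "background",
             "target", "background", "background", "background", "background"]
counts = confusion_matrix(truth, predicted)
p_d, p_fa = detection_and_false_alarm(counts)
```

Here the probability of detection is the fraction of ground truth target arrays
classified as targets, and the probability of false alarm is the fraction of
ground truth background arrays classified as targets.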


3.0 RESULTS OF AUTOMATIC CLASSIFICATION

SAR imagery taken from Gallant Hand and Reforger data was first processed
to obtain a set of change images, each containing military targets. The
processing was done with CDC computer programs which simulate the Modular
Change Detector (MCD) processing required to create a difference image.
Figures 3-1 through 3-9 show the results of this processing. In each case,
except Figure 3-1, the reference, mission, warped mission, and difference
images are presented. The target changes are shown as white in Figure 3-1 and
black on the other images.

The  next  stage  of  the  processing  is  performed  on  each  difference  image 
separately.  First,  a pixel  intensity  threshold  is  imposed  on  the  difference 
image  to  create  mission  change  events.  Then  an  event  size  threshold  is  imposed 
so  that  only  "large"  events  remain.  The  intensity  and  size  thresholds  are 
determined  either  by  the  user  (supervised  thresholding)  or  automatically 
(adaptive thresholding). At this point the change data correspond to the MCD
output.
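
The two thresholding steps can be sketched as follows. This is an illustrative
reconstruction under stated assumptions: events are taken to be 4-connected
groups of pixels, a pixel passes when its intensity is at or above the
threshold, and an event is kept when its pixel count is at or above the size
threshold. The report does not specify these conventions, and the function name
and toy image are hypothetical.

```python
def find_events(image, intensity_threshold, size_threshold):
    """Return change events as sorted lists of (row, col) pixels."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    events = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < intensity_threshold:
                continue
            # Flood-fill one connected group of above-threshold pixels.
            stack, pixels = [(r, c)], []
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] >= intensity_threshold):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            # Size threshold: keep only the "large" events.
            if len(pixels) >= size_threshold:
                events.append(sorted(pixels))
    return events


# Toy difference image: one three-pixel event and one isolated pixel.
image = [
    [30, 30, 0, 0],
    [30, 0, 0, 25],
    [0, 0, 0, 0],
]
events = find_events(image, intensity_threshold=24, size_threshold=2)
```

In the toy image the isolated pixel passes the intensity threshold but is
rejected by the size threshold, leaving a single event.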

The resulting change event images are presented in Figures 3-9 through
3-17 (events are white). The caption of each image includes the thresholding
procedure and values used. The target clusters are indicated by white
hand-drawn boundaries. The other events are those changes which have passed
the change criteria at this point in the processing. These can be either valid
but non-interesting changes, or false changes induced by signal noise,
scintillation, etc. The ground truth is not adequate to discriminate these
background categories.

Next,  the  events  in  each  image  are  clustered  into  arrays  using  either 
Iterative  Clustering  or  Minimal  Spanning  Tree  Clustering,  which  are  described 
in  Section  2.0.  The  captions  of  Figures  3-9  through  3-17  indicate  the 
clustering  method  used. 

The next stage of the processing (Figure 2-5) involves the use of event
ground truth to establish array ground truth. Those arrays composed of more
than 52% target events (labeled from ground truth) are labeled target arrays
and are circled in Figures 3-9 through 3-17.

Figure 3-1. Gallant Hand (Gun Emplacements) Difference Image.

Figure 3-2. Reforger Region A (Helicopters).

Figure 3-3. Reforger Region B (Hawk Site).

Figure 3-5. Reforger Region D (Hawk Site).

Figure 3-6. Reforger Region F (Helicopters).

Figure 3-7. Reforger Region G (Armor).

Figure 3-9. Gun Emplacement Event Image with Target Arrays Circled.
Thresholds: Supervised Size (It) and Intensity (24).
Clustering: Iterative.

Figure 3-10. Region A (Helicopters) Event Image (Enlarged) with
Target Arrays Circled.
Thresholds: Supervised Size (13) and Intensity (24).
Clustering: Minimal Spanning Tree.

Clustering: Minimal Spanning Tree.

Figure 3-12. Region A (Helicopters) Event Image (Enlarged) with
Target Arrays Circled.
Thresholds: Adaptive Size (2) and Intensity (17).
Clustering: Minimal Spanning Tree.

Figure 3-13. Region B (Hawk Site) Event Image (Enlarged) with Target
Arrays Circled.
Thresholds: Adaptive Size (2) and Intensity (19).
Clustering: Minimal Spanning Tree.

Figure 3-14. Region D (Hawk Site) Event Image (Enlarged) with
Target Arrays Circled.
Thresholds: Adaptive Size (2) and Intensity (??).
Clustering: Minimal Spanning Tree.

Figure 3-16. Region G (Armor) Event Image (Enlarged) with Target Arrays
Circled.
Thresholds: Adaptive Size (2) and Intensity (19).
Clustering: Minimal Spanning Tree.

Figure 3-17. Region N (Tanks) Event Image (Enlarged) with
Target Arrays Circled.
Thresholds: Adaptive Size (2) and Intensity (17).
Clustering: Minimal Spanning Tree.


Comparison of Figures 3-10 and 3-12 shows the differences which result
when different thresholds are used on the same difference image. A single
large target array in Figure 3-10 is replaced by three target arrays in Figure
3-12. The target array shown by arrows in Figure 3-10 is absent from Figure
3-12 because in the latter the array consisted of only 50% target events.
(An array must consist of at least 52% target events to be labeled a target
array and at least 52% background events to be labeled a background array.)
Therefore, this array is not labeled and is removed from further consideration
in classifier training and in automatic classification.
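
The array labeling rule can be sketched as a small function. The name
`label_array` and the use of `None` for an unlabeled array are illustrative
choices. The threshold is written as "at least 52%", following the
parenthetical statement above; an earlier passage says "more than 52%", so the
boundary case is an assumption.

```python
def label_array(n_target_events, n_background_events, min_fraction=0.52):
    """Label an array from its event ground truth, or return None."""
    total = n_target_events + n_background_events
    if total == 0:
        raise ValueError("an array must contain at least one event")
    if n_target_events / total >= min_fraction:
        return "target"
    if n_background_events / total >= min_fraction:
        return "background"
    # Neither class reaches the threshold: the array is left unlabeled
    # and dropped from training and from automatic classification.
    return None


mostly_targets = label_array(3, 1)      # 75% target events
even_split = label_array(1, 1)          # 50% target, 50% background
mostly_background = label_array(1, 4)   # 80% background events
```

An array split evenly between target and background events, like the one
discussed above, satisfies neither condition and is left unlabeled.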

The target array shown by arrows in Figure 3-12 is missing in Figure
3-10 because of a modification in the event ground truth. Ground truth for
Reforger Region A indicates the presence of helicopters, although which
particular events are the helicopters is not specified. Therefore, all events
which, on the basis of size and spacing from other events, could reasonably be
designated helicopters were "identified" as such. The result of this event
labeling (after clustering) is shown in Figure 3-10 where the target arrays
are circled. At a later date, the decision was made to consider only the two
classes, "military equipment" arrays and "background" arrays, because the small
number of tank arrays, helicopter arrays, etc. precluded their being used as
separate classes in the training of the classifier. The events comprising
the array circled in Figure 3-12 but not in Figure 3-10 were believed to be
of a military nature, even though the events did not appear to be helicopters.
Thus, these events were labeled as target events so that the array in Figure
3-12 is labeled a target array.

The following material describes the three different classifier training
and automatic classification experiments which were performed under this
contract. Each experiment consists of a selection of images from among those of
Figures 3-9 through 3-17. As illustrated in Figure 2-5, each array in a
selected image is labeled as a target array or a background array on the basis
of event ground truth. All of the target arrays in the selected images are
used to evaluate the parameters in the target class-conditional probability
density function, i.e., to train the classifier. The background arrays are
used similarly. The trained classifier is then used to automatically classify
each array as a target array or a background array. Comparison is then made
between the classifications as given by the automatic classifier and by the
ground truth assignment. These comparisons are summarized for the experiment
as a confusion matrix which presents the number of target arrays classified
as target arrays (a quantity related to the probability of detection), the
number of target arrays classified as background arrays (missed detections),
and the corresponding quantities calculated for the background arrays, of
which the number of background arrays classified as target arrays is related
to the false alarm rate.
3.1  FIRST  EXPERIMENT 

In this experiment, only the gun emplacement data from Gallant Hand,
shown in Figure 3-9, was used. Of the 61 arrays obtained by clustering the
events, eight were labeled as target arrays, 52 as background arrays, and one
as neither (since it was composed of 50% target events and 50% background
events). The eight target arrays are circled in Figure 3-9.

To train the classifier as indicated in Figure 2-5, 20 features were
evaluated for each of the target arrays (or gun emplacement clusters). These
eight points in the 20-dimensional feature space were used to evaluate the
parameters (the 20-feature means and covariance matrix for the target arrays)
in the Gaussian-modeled gun emplacement class-conditional probability density
f(X|T). Next, the same 20 features were evaluated for each of the 52
background (on the basis of event ground truth) arrays. These 52 points in the
20-dimensional feature space were used to evaluate the parameters (the
20-feature means and covariance matrix for the background arrays) in the
Gaussian-modeled background class-conditional probability density f(X|B).

Next, as shown in Figure 2-6, the number of feature variables was reduced
from 20 (X) to three (X') by determining the three independent linear
combinations of the original 20 feature variables which are best at separating
in feature space the target points from the background points. The three new
variables are called "generalized features".

The linear transformation W relating the 20 features to the three
generalized features can be written as in Equation 2.1. The 20-feature target
and background means and covariance matrices are related to the three
generalized feature target and background means and covariance matrices by
Equations 2.1 and 2.2. Thus, the probability density functions for target
arrays, f(X'|T), and background arrays, f(X'|B), for this experiment were
evaluated.

The final step in the experiment was the inputting of each of the 60
arrays to the trained classifier for automatic classification as shown in
Figure 2-6. For each array, the 20 features were evaluated, yielding a
20-vector X for that array. Using Equation 2.1, a generalized feature
three-vector X' was calculated. Finally, the class-conditional probability
densities for target, f(X'|T), and for background, f(X'|B), were evaluated for
the array. These densities were used to calculate the Bayes a posteriori (i.e.,
after the measurement of X') probabilities P(T|X'), which is the probability
that the array with generalized features evaluated at X' is a target array,
and P(B|X'), which is the probability that the array is a background array.
P(T|X') and P(B|X') are given by Equations 2.3 and 2.4, where the a priori
probabilities q(T) and q(B) are given by


$$q(T) = \frac{8}{60} \qquad (3.1)$$

$$q(B) = \frac{52}{60} \qquad (3.2)$$


If P(T|X') > .50, the array is automatically classified as a target array;
otherwise, it is automatically classified as a background array. Comparison
is then made with the classification of the array based on event ground truth.
The resulting confusion matrix is shown in Table 3-1.


Since all target arrays were correctly identified, the probability of
detection is 100%. Since one background array was incorrectly identified, the
probability of a false alarm was 1/52, which is approximately 2%.

TABLE 3-1. RESULTS OF AUTOMATIC CLASSIFICATION FOR FIGURE 3-9


3.2  SECOND  EXPERIMENT 


In  this  experiment,  the  Gallant  Hand  Gun  Emplacement  image  (Figure  3-9), 
the  Reforger  Region  A helicopter  image  (Figure  3-10),  and  the  Reforger  Region 
C beacon  image  (Figure  3-11)  were  used.  The  115  arrays  were  obtained  by 
clustering  each  of  the  three  images  separately.  Seventeen  arrays  were  labeled 
target  arrays  and  98  were  labeled  background  arrays  on  the  basis  of  the  event 
ground  truth.  The  17  target  arrays  are  circled  in  Figures  3-9,  3-10,  and  3-11. 

The procedures used in this experiment are the same as those described in
the first experiment. To train the classifier as indicated in Figure 2-5,
20 features were evaluated for each of the 17 target arrays. The resulting
17 points in the 20-feature space were used to evaluate the parameters (the
20-feature means and covariance matrix for the target arrays) in the Gaussian-
modeled target array class-conditional probability density f(X|T). The same
20 features were evaluated for each of the 98 background arrays. The
resulting 98 points in the 20-feature space were used to evaluate the
parameters (the 20-feature means and covariance matrix for the background
arrays) in the Gaussian-modeled background class-conditional probability
density f(X|B).

Next, as shown in Figure 2-6, the three generalized feature variables
which are linear combinations of the original 20 feature variables and which
are best at separating in feature space the target points from the background
points were determined by the class-separating transformation W determined
in this experiment. The 20-feature target and background means and covariance
matrices are related to the three generalized feature target and background
means and covariance matrices by Equations 2.1 and 2.2. Thus, the probability
density functions for targets, f(X'|T), and background, f(X'|B), for this
experiment were evaluated, so the classifier was trained.

Finally, each of the 115 arrays was inputted to the trained classifier
for automatic classification, as shown in Figure 2-6. For each array, the 20
features were evaluated, yielding a 20-vector X for that array. Using
Equation 2.1, a generalized feature vector X' was determined for the array.
Then, the class-conditional probability densities for target arrays, f(X'|T),
and for background arrays, f(X'|B), were evaluated for the array. These
densities were used to calculate the Bayes a posteriori probabilities P(T|X')
and P(B|X') that the array with generalized feature X' is a target array and a
background array, respectively. P(T|X') and P(B|X') are given by Equations
2.3 and 2.4, where the a priori probabilities q(T) and q(B) are given by


$$q(T) = \frac{17}{115} \qquad (3.3)$$

$$q(B) = \frac{98}{115} \qquad (3.4)$$

The confusion matrix shown in Table 3-2 is the result of the comparison
of classification of the arrays based on automatic classification and event
ground truth. Since all target arrays were correctly identified, the
probability of detection was 100%. Since five out of 98 background arrays were
automatically classified incorrectly as targets, the probability of false
alarm was 5/98, which is approximately 5%. Thus, the addition of Region A
with helicopter arrays (Figure 3-10) and Region C with beacon arrays (Figure
3-11) to the gun emplacement image (Figure 3-9) resulted in an increase in the
probability of false alarm from 2% to 5%. However, the probability of target
array detection remained 100%.


TABLE  3-2.  RESULTS  OF  AUTOMATIC  CLASSIFICATION  FOR  FIGURES  3-9,  3-10,  AND  3-11 


3.3  THIRD  EXPERIMENT 

In this experiment, Reforger Regions A, B, D, F, G, and N, shown in
Figures 3-12 through 3-17, respectively, were used. Region C was not used
because the beacons had patterns qualitatively different from those of the
other regions. The 119 arrays were obtained by clustering each of these
images separately. Fifteen arrays were labeled target arrays (Region A has
helicopter arrays; Region B, a Hawk site; Region D, a Hawk site; Region F,
helicopters; Region G, an armor array; and Region N, a tank array). Thus, the
target arrays were arrays of different types of military equipment. In
addition, there were 104 background arrays. The arrays were labeled as target
or background on the basis of the event ground truth. The 15 target arrays are
circled in Figures 3-12 through 3-17.

The procedures used in this experiment were the same as those described
in the first and second experiments. To train the classifier as indicated in
Figure 2-5, 20 features were evaluated for each of the 15 target arrays. These
points were used to evaluate the 20-feature means and the covariance matrix
for the target arrays so that the Gaussian-modeled target array class-
conditional probability density f(X|T) was parameterized. The 20 features
evaluated for the 104 background arrays were used to evaluate the
corresponding parameters for the background array probability density f(X|B).

Next, as shown in Figure 2-6, the three generalized feature variables,
which are linear combinations of the original 20 feature variables and which
are best at separating in feature space the target points from the background
points, were determined by the class-separating transformation W determined in
this experiment. The 20-feature target and background means and covariance
matrix are related to the three generalized feature target and background
means and covariance matrix by Equations 2.1 and 2.2. Thus, the probability
density functions for targets, f(X'|T), and background, f(X'|B), for this
experiment were evaluated so that the classifier was trained.


Finally, each of the 119 arrays was inputted to the trained classifier
for automatic classification, as shown in Figure 2-6. For each array, the 20
features were evaluated, yielding a 20-vector X for that array. Using
Equation 2.1, a generalized feature vector X' is determined for the array.
Then, the class-conditional probability densities for target arrays, f(X'|T),
and for background arrays, f(X'|B), were evaluated for the array. These
densities were used to calculate the Bayes a posteriori probabilities P(T|X')
and P(B|X') that the array with generalized feature X' is a target array and a
background array, respectively. P(T|X') and P(B|X') are given by Equations
2.3 and 2.4, where the a priori probabilities q(T) and q(B) are given by


$$q(T) = \frac{15}{119} \qquad (3.5)$$

$$q(B) = \frac{104}{119} \qquad (3.6)$$


The confusion matrix shown in Table 3-3 is the result of the comparison
of classification of the arrays based on automatic classification and event
ground truth. Since all target arrays were correctly identified, the
probability of detection was 100%. Since two out of 104 background arrays were
automatically classified incorrectly as targets, the probability of false
alarm was 2/104, which is approximately 2%.
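
The arithmetic behind the three experiments can be checked with a few lines.
The counts are those quoted in Sections 3.1 through 3.3; the dictionary layout
and function name are illustrative only.

```python
# Array counts and misclassified background arrays from Sections 3.1 to 3.3.
experiments = {
    "first": {"targets": 8, "backgrounds": 52, "false_alarms": 1},
    "second": {"targets": 17, "backgrounds": 98, "false_alarms": 5},
    "third": {"targets": 15, "backgrounds": 104, "false_alarms": 2},
}


def summarize(exp):
    """A priori probabilities and background error rates for one experiment."""
    total = exp["targets"] + exp["backgrounds"]
    p_false_alarm = exp["false_alarms"] / exp["backgrounds"]
    return {
        "q_target": exp["targets"] / total,
        "q_background": exp["backgrounds"] / total,
        "p_false_alarm": p_false_alarm,
        "background_correct": 1.0 - p_false_alarm,
    }


summary = {name: summarize(exp) for name, exp in experiments.items()}
for name, s in summary.items():
    print(f"{name}: q(T)={s['q_target']:.3f}  q(B)={s['q_background']:.3f}  "
          f"P(false alarm)={100 * s['p_false_alarm']:.1f}%")
```

The false alarm percentages round to 2%, 5%, and 2%, in agreement with the
values stated for the three experiments.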
TABLE 3-3. RESULTS OF AUTOMATIC CLASSIFICATION FOR FIGURES 3-12 THROUGH 3-17

4.0 CONCLUSIONS AND RECOMMENDATIONS


All three of the automatic classification experiments had 100% correct
classification of the military equipment in operational deployment. The
percentage of correct classification of background arrays ranged from 95% to
98%. In the first classification experiment, the only military equipment
in the imagery consisted of gun emplacements. In the second classification
experiment, the military equipment consisted of gun emplacements, helicopters
and beacons. In the third experiment, the military equipment consisted of
helicopters, Hawk sites, armor and tanks.

The use of "military equipment" as a class was necessary because of the
limited number of arrays of any particular type available for training the
classifier. When more training data becomes available, helicopter arrays,
etc., can be considered as separate target classes. Then the classifier can
be extended from the two-class system (background and target) to a
multi-class system (background, helicopter arrays, tank arrays, etc.). The
classifier output would then specify military equipment as to target type,
i.e., helicopter arrays would be identified as helicopters and tank arrays
would be identified as tanks.

The Air Force has developed the Modular Change Detector (MCD) for
performing real-time change detection using a reference and mission pair of
radar images. The MCD increases operator effectiveness by cueing him to
changes. Going further, the three automatic classification experiments
performed under this contract establish the feasibility of automatically
screening for certain classes of targets consisting of groups of changes,
e.g., missile sites and clusters of tanks or helicopters. Incorporating this
screening on the output of the MCD could cue a user to particular targets of
interest. Such cueing discriminates against both false alarm changes and
legitimate changes which are not of interest. Previous studies have shown this
could increase the operator's coverage rate and his probability of detection.
APPENDIX A

ITERATIVE  CLUSTERING  METHOD 

Problem 


Assign n_e events to n_c clusters in such a way that [...] and all
events assigned to the same cluster appear to the eye to be associated, and
that all events that appear to the eye to be associated by proximity are
assigned to the same cluster.

Approach 

The first step is to define a probability density for each cluster.
This is a function of the distance from the cluster center and the number
and covariance of events assigned to the cluster. The next step is to
iteratively pass through the list of events, either assigning each event to
the maximum likelihood cluster, determined by evaluating the conditional
probability density for each existing cluster at the location of the event
being examined, or creating a new cluster of one event.


Upon completion of each iteration, the result is examined to decide
whether to accept the solution or modify the parameters and perform another
iteration.


Process  Details 


The n_e events to be clustered are each described by a set of six numbers:

where X_i, Y_i are the image coordinates of pixel i in the event.

It will be convenient to refer to the properties of the event
in the following way:

Event centroid


Event covariance
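
Since the defining expressions are not legible here, the following sketch
shows one conventional way to obtain an event centroid and a 2x2 event
covariance from pixel coordinates. The function names are hypothetical, and
normalizing by the pixel count n is an assumption that may differ from the
definitions originally printed.

```python
def event_centroid(pixels):
    """Mean image coordinates (x, y) of the pixels in an event."""
    n = len(pixels)
    return (sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n)


def event_covariance(pixels):
    """2x2 covariance of pixel coordinates about the event centroid."""
    n = len(pixels)
    cx, cy = event_centroid(pixels)
    sxx = sum((x - cx) ** 2 for x, _ in pixels) / n
    syy = sum((y - cy) ** 2 for _, y in pixels) / n
    sxy = sum((x - cx) * (y - cy) for x, y in pixels) / n
    return [[sxx, sxy], [sxy, syy]]


# Toy event: four pixels at the corners of a 2 x 2 square.
square = [(0, 0), (2, 0), (0, 2), (2, 2)]
centroid = event_centroid(square)
covariance = event_covariance(square)
```

For the symmetric toy event the off-diagonal term vanishes, as expected for a
shape with no preferred orientation.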


The probability density function which describes the distribution
of events in a cluster depends on the distance and direction from the cluster
center and the number of events, n_c, in the cluster. The following notation
is used.


Cluster centroid

Cluster covariance

where the index j runs over all events included in cluster i.


S.  i ; defined  as  written  only  for  n^>l.  In  the  current  implementation, 

tin  preceding  definition  is  used  for  n >3.  For  n 2, 

c — c— 


It  i • thereby  assumed  that  a cluster  with  one  or  two  events  is  circularly 
symmetric  about  its  mean  with  a characteristic  dimension  of  D. 


Tlie  probability  density  is  written  as 
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The  function  h(n,s)  is  a smoother  to  prevent  the  size  of  the  function 
f(r,n)  from  changing  rapidly  from  a circularly  symmetric  function  of  dimension 
D to  an  elliptically  symmetric  function  with  dimensions  defined  by  the 
covariance  matrix  S.  The  transformation  which  diagonalizes  S leads  to 
eigenvalues  corresponding  to  the  major  and  minor  axes  of  an  ellipse  as  follows 


h(n,s) is defined by requiring that, for n = 3, the area of the ellipse with
these axes equals the area of the circle with diameter D. For larger
values of n, the distribution is allowed to gradually assume a size depending
only on the covariance of events in the cluster. Thus



Assignment of an event located at $R_j = (X, Y)$ to the maximum likelihood
cluster is accomplished by evaluating $f_i(R_j, n_c)$ for each cluster i, and
assigning event j to cluster i if

$$f_i(R_j, n_c) \ge f_k(R_j, n_c) \quad \text{for any value of } k.$$


Event j is assigned to the maximum likelihood cluster or begins a new
cluster according to whether the probability volume from the cluster centroid
to the contour of equal likelihood passing through $R_j$ is less than or
greater than a threshold value P. Fortunately there is an analytical form for
this integral. It can be shown that

$$2\pi \int_0^{r} f(r', n)\, r'\, dr' = P(r, n),$$
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where 


P(r,n)  = 1 


The values of P(r,n) at r = 0 and $r = \infty$ show that f(r,n) is in fact
a probability density: $P(0,n) = 0$; $P(\infty,n) = 1$.

Upon completion of one iteration, every event has been assigned to
some cluster. Subsequent iterations are conducted, for each event, by
removing the event from the cluster to which it is assigned (in order that
the cluster statistics used to make the maximum likelihood cluster selection
will not be biased in favor of the previous decision), and making a new
decision.

If the parameters D and P are not changed, two results may be found.
Either the iteration cycles are terminated by detecting that no further
changes are introduced by iterating, or changes always occur. In the
latter case, it has been found that this usually means that two or more events
are transferred back and forth between the same pair of clusters. Situations
of this type are detected and treated as appropriate to prevent the process
from continuing indefinitely.

In the first case, when an iteration produces no changes in cluster
assignments, a decision is required. Either the solution is accepted, or
the solution is disturbed to produce changes. The disturbance may be intended
either to produce more clusters or fewer clusters.
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Alternatives  available  for  disturbing  the  solution  include: 


1.  Adjustment  of  the  parameter  values  D and  P. 

2.  Selecting and removing from its cluster an event to create a new
cluster.

3.  Restarting the process and preventing actions which lead to an
unsatisfactory solution.

Selection of an appropriate alternative requires an evaluation of the
current solution. Each solution (the state of the system after completion
of an iteration) is characterized by three numbers. These are the number
of clusters, $N_c$; the number of events moved from one cluster to a new
cluster, $N_m$; and the number of clusters with a characteristic dimension
exceeding a threshold value.

Before discussing the role of these numbers in selecting an alternative,
it is appropriate to discuss some additional parameters used to start
the clustering process.

There  are  two  special  values  of  D,  the  size  of  the  circularly  symmetric 
distribution  used  to  start  clusters;  its  initial  value,  and  its  maximum 
value.  Both  of  these  are  input  parameters.  Whenever  D is  set  to  a value 
larger  than  the  maximum  value,  the  symmetric  form  of  the  probability  density 
function  is  used  for  all  computations  without  regard  to  the  number  of  events 
in  a cluster. 

The initial value of D may be set to a value larger than the maximum
value  to  accomplish  a swift  gross  clustering  by  distance  only.  A clustering 
solution  is  never  accepted  as  final  when  D exceeds  the  maximum  permitted 
value.  When  the  clustering  is  performed,  the  first  iteration  typically  has 
a large  value  of  D and  produces  a small  number  of  very  large  clusters. 




The second iteration is performed with D set down to the maximum permitted
value, and probability density functions are computed as already described.
By this operation the few large clusters formed initially fall
apart into several smaller ones. Subsequent iterations are performed to
remove residual order dependence and to find a stable solution that satisfies
other imposed requirements.

Other parameters used to begin the clustering process include an
initial estimate of the number of clusters in the solution. (If this estimate
is omitted, a one will be used. If provided, the number of iterations
needed to reach a satisfactory solution may be reduced.) Also included are
P, the probability threshold for beginning a new cluster; L, a characteristic
dimension associated with the largest significant activity of interest; and
$N_L$, the smallest number of events associated with the largest activity.

At the end of each iteration, the three numbers which describe the
state of the solution are examined. Iterations continue until the largest
cluster in the solution with more than $N_L$ events has a characteristic
dimension smaller than L. The parameters D and P are adjusted at each iteration
to encourage an increase or a decrease in the number of clusters obtained.
Decreasing P encourages new clusters; increasing P discourages new clusters;
similarly for D. D is never raised above the maximum value except when
restarting is necessary. P is adjusted between the limits of 0.5 and 1.0.

When  several  iterations  with  readjusted  parameters  fail  to  change  the 
number  of  clusters,  direct  action  is  taken.  If  more  clusters  are  needed,  a 
permanent  new  cluster  is  formed  by  removing  from  the  largest  cluster,  the 
event  which  has  the  largest  Mahalanobis  distance  from  the  cluster  center. 

A large  increase  in  the  number  of  clusters  which  results  from  this  action  is 
not  permitted.  If  parameter  adjustment  fails  to  recombine  some  of  them, 
the  offending  action  is  recorded  and  the  clustering  process  is  reinitiated 
with  a very  large  value  of  D.  When  the  process  returns  to  the  offending 
action,  it  is  prevented  and  another  alternative  is  forced. 
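
The direct action described above, removing the event with the largest
Mahalanobis distance from the center of the largest cluster, can be sketched
as follows. This is a minimal illustration and not the contract software: the
function names are hypothetical, and the sample covariance of the event
centroids (with the candidate event included, and assumed non-singular) stands
in for the cluster covariance S, whose exact normalization is not legible here.

```python
import numpy as np

def farthest_event(centroids):
    """Index of the event with the largest Mahalanobis distance from the
    cluster mean. `centroids` is an (n, 2) array of event centroids."""
    pts = np.asarray(centroids, dtype=float)
    diff = pts - pts.mean(axis=0)
    # Assumption: sample covariance of the centroids plays the role of S.
    cov = diff.T @ diff / (len(pts) - 1)
    inv = np.linalg.inv(cov)
    # Squared Mahalanobis distance of every event from the cluster mean.
    d2 = np.einsum("ij,jk,ik->i", diff, inv, diff)
    return int(np.argmax(d2))

def split_off(centroids):
    """Return (remaining cluster, new one-event cluster)."""
    pts = np.asarray(centroids, dtype=float)
    idx = farthest_event(pts)
    return np.delete(pts, idx, axis=0), pts[idx:idx + 1]

cluster = [[0, 0], [1, 0], [0, 1], [1, 1], [10, 10]]
remaining, new_cluster = split_off(cluster)
```

In this example the isolated event at (10, 10) is the one split off to seed
the new cluster.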


When a solution is reached in which all clusters are of satisfactory
size, and the number of clusters in the solution is reached through an
orderly process, a stability test is performed. In this situation iterations
are performed which permit events to move freely from one cluster to another;
no new clusters are created and no old ones are destroyed. This is
accomplished by assigning each event to the maximum likelihood cluster and by
refusing to destroy a cluster by reassigning the cluster nucleus (the event
nearest the cluster centroid; these are determined for each cluster before
the stability test begins). Iterations continue until no reassignments are
needed or until it is apparent that the reassignments are due to oscillation
of two or more events between two or more clusters. In either case the
system is then judged to be in a relaxed state.
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APPENDIX  B 


FEATURES  FOR  DESCRIBING  ARRAYS 

After the clusters have been isolated, that is, individual events have
been grouped in an array, characteristics of these arrays can be used to
identify targets. The most obvious characteristics or features are the size
and intensity of individual events and combinations of size and intensity
of events in an array. A proposed set of features is described which will
be used in the multi-variate group-separating procedures. The features
describe the geometric properties of a group of change events in an array.


Subroutines  to  perform  the  computations  necessary  to  transform  binary 
data  describing  an  array  of  change  events  into  the  set  of  twenty  features  are 
on  hand.  The  following  measurements  are  accumulated  for  each  array. 


$N_j$ = Number of pixels in event j

$r_j$ = Centroid of event j

$C_j$ = Covariance of event j about the event centroid

N = Number of events in the array

R = Geometric centroid of the array

C = Geometric covariance of the array

$R_m$ = Mass centroid of the array

$C_m$ = Mass covariance of the array

The  quantitative  definitions  of  these  measurements  are  given  below. 


$r_{ij}$ = image coordinate vector to pixel i of event j


set  of  all  pixels  in  event  J 
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Rj  = matrix  of  pixel  vectors  in  event  j 


= $(r_{1j} - r_j,\ r_{2j} - r_j,\ \ldots,\ r_{N_j j} - r_j)$
$C_j = \frac{1}{N_j} R_j R_j^T$


R = matrix  of  event  centroids  about  the  array  centroid 


= $(r_1 - R,\ r_2 - R,\ \ldots,\ r_N - R)$


= matrix  of  event  centroids  about  the  mass  centroid 
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Define  three  diagonalizing  transformations  as  follows: 
The eigenvalues of the three covariance matrices are found on the diagonals
of the left hand side and the eigenvectors which define the three principal
directions are:
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Define  the  distance  from  event  i to  its  nearest  neighbor  as 


$$d_i = \min_{j \neq i} |r_i - r_j|$$
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A set  of  twenty  features  which  characterize  the  array  is  computed  from 
these  measurements.  The  description  is  based  on  the  size,  shape,  arrangement, 
orientation  and  uniformity  of  events  in  the  array.  It  is  also  based  on  the 
array  size  and  shape  and  on  the  mass  size,  shape  and  displacement.  The 
following  features  are  used. 


1.  Array  Size 


Feature 1 = $\log_{10} N$


2.  Spacing of Events


This is the only feature which requires a search of the data.

Define $\bar{d} = \frac{1}{N} \sum_{i=1}^{N} d_i$

Feature 2 = $\log_{10} \bar{d}$


3.  Regularity of Spacing


(d1  - d)2 
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Feature  3 


4.  Event  size 


N 


Define  = 


_L 
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Feature  4 = 


5.  Uniformity of Event Size


Feature 5 =


6.  Event  Shape 


Feature  6 


where 


7.  Uniformity  of  Event  Shape 


Feature  7 = 


8.  Uniformity  of  Event  Orientation 


Feature  8 
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where 
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Ou»  h12)j 
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and 


$H_{1j} = (h_{11}, h_{12})_j$

is a unit vector in the principal direction of event j.


9.  Array  Size 


Feature  9 = log.J  / 

\V 


where  A^  is  the  first  eigenvalue  of  the  array  using  the  array  centroid. 


10.  Array  Shape 


Feature  10 


-yv\ 


11.  Orientation  of  Events  to  Array 


Feature 


ii  = J ~ 2 [i  - (urHlj)2} 


12.  Event  Area 


Define $\bar{N} = \frac{1}{N} \sum_{j=1}^{N} N_j$


Feature 12 = $\log_{10}(\bar{N})$


14.  Array  Mass  Size 
Feature  14  = log 


1C 
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where  is  the  first  eigenvalue. 


15.  Array  Mass  Shape 


Feature  15  =I^2/T 

V 1 


16.  Orientation  of  Array  Mass  to  Array 


reature  16  = ll  - 


17.  Distance  from  Mass  to  Geometric  Centroids 


Feature 17 = $\log_{10}(|R - R_m|)$


18.  Array  Pixel  Density 
Feature  18  = - log 


10 


(i  +/r1)(i  + Jr2) 


19.  Event Pixel Density


Define  P^  = 


N . 
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( 1 + ) ( 1 + '/x^j ) 
Feature 19 = $\bar{P} = \frac{1}{N} \sum_{j=1}^{N} P_j$
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20.  Uniformity  of  Event  Pixel  Density 
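
As an illustration of how the first two features might be computed from the
event centroids, the following sketch evaluates the nearest-neighbor distance
of each event by a brute-force search. It assumes, as the surviving text
suggests, that Feature 1 is the base-10 logarithm of the number of events and
that Feature 2 is the base-10 logarithm of the mean nearest-neighbor distance;
the function name and dictionary keys are hypothetical.

```python
import numpy as np

def spacing_features(event_centroids):
    """Features 1 and 2 for an array of at least two events.
    `event_centroids` is an (N, 2) array of event centroids r_j."""
    r = np.asarray(event_centroids, dtype=float)
    n_events = len(r)
    # All pairwise distances; this is the search of the data noted above.
    diff = r[:, None, :] - r[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)      # an event is not its own neighbor
    d = dist.min(axis=1)                # nearest-neighbor distance d_i
    return {
        "feature_1": float(np.log10(n_events)),
        "feature_2": float(np.log10(d.mean())),
        "nearest": d,
    }

# Four events on the corners of a 10 x 10 square.
feats = spacing_features([[0, 0], [10, 0], [0, 10], [10, 10]])
```

For the square layout every event has its nearest neighbor at distance 10, so
the mean spacing is 10 and its logarithm is 1.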


APPENDIX  C 


CLASS  - SEPARATING  TRANSFORMATION 

The  following  discussion  develops  the  "optimum"  linear  transformation  W 
of  the  array  vectors  in  feature  space  which  both  clusters  arrays  of  the  same 
class  and  separates  arrays  that  belong  to  different  classes.  The  measure  of 
clustering  or  separation  between  two  arrays  is  the  Euclidean  distance  between 
the  arrays  after  their  linear  transformation  to  the  new  feature  space. 

The  effect  of  such  a transformation  is  illustrated  in  Figure  C-l,  where 
like  arrays  of  each  class  have  been  clustered  and  the  two  clusters  have  been 
separated  from  each  other. 


Figure  C-l.  Separation  of  Classes  after  the  Linear 
Transformation  W to  New  Feature  Space. 

One transformation that accomplishes the stated objective can be specified
as follows: Find the linear transformation W that maximizes, after
transformation, the mean-square distance between points that belong to
different classes (the mean-square interset distance) subject to the
constraint that the mean-square distance between the points of one class (the
mean-square intraset distance of the class) is held constant after
transformation.

The particular linear transformation W that maximizes after transformation
the mean-square interset distance while holding the mean-square
intraset distance of one class constant after transformation is developed
below. The purpose of the transformation is to separate arrays of different
classes while clustering those that belong to the same class.
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The  linear  transformation  W maps  the  N-dimensional  feature  vector  f into 


f'  = W f 
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so  that  f. 
The mean-square distance between the $M_1$ members of the set $\{f_m\}$ and
the $M_2$ members of the set $\{g_p\}$, after the linear transformation, is
given by Equation C.4, where $f_{ms}$ and $g_{ps}$ are, respectively, the sth
components of the mth and pth arrays of the sets $\{f_m\}$ and $\{g_p\}$. For
notational simplicity, this mean-square interset distance is denoted by
$S(\{f_m\}, \{g_p\})$ and is the quantity to be maximized by a suitable choice
of the linear transformation W. The choice of the notation above is intended
to signify that the transformation to be found is a function of the two sets.
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m=l  p=l  n*l 


(C.4b) 
The coefficient $x_{sr}$ is the general element of the matrix X, which is of
the form of a moment matrix and arises from considerations of cross-set
distances. The matrix T with general coefficient $t_{sr}$ arises from
considerations involving distances between the points of only one set, the set
$\{f_m\}$.

Equation C.6 can be maximized, subject to the constraint of Equation C.7,
by the method of Lagrange multipliers. Since $dw_{ns}$ is arbitrary in Equation
C.8, Equation C.9 must be satisfied.
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N (C.9) 


Equation C.9 can be written in matrix notation to exhibit the solution
in an illuminating way. Let $w_n$ be a row vector with N components
$(w_{n1}, \ldots, w_{nN})$; then Equation C.9 can be written as

$$
\begin{aligned}
w_1 (X - \lambda T) &= 0 \\
&\ \,\vdots \\
w_n (X - \lambda T) &= 0 \\
&\ \,\vdots \\
w_N (X - \lambda T) &= 0
\end{aligned}
\qquad \text{(C.10a)}
$$

Multiplying both sides of the equation from the right by $T^{-1}$, Equation
C.10b, which is of the form of an eigenvalue problem, is obtained.

$$
\begin{aligned}
w_1 (X T^{-1} - \lambda I) &= 0 \\
&\ \,\vdots \\
w_n (X T^{-1} - \lambda I) &= 0 \\
&\ \,\vdots \\
w_N (X T^{-1} - \lambda I) &= 0
\end{aligned}
\qquad \text{(C.10b)}
$$

$T^{-1}$ exists since T is positive definite. Equations C.10a and C.10b can be
satisfied in either of two ways. Either $w_n$ (the row vector which is the
nth row of the linear transformation W given by Equation C.2) is identically
zero, or it is an eigenvector of the matrix $XT^{-1}$. In order to find the
solution that maximizes S, a substitution in the mean-square interset distance
given by Equation C.6a must be made. Note first that through matrix notation,
Equations C.6a and C.7a can be written as Equations C.11 and C.12.


$$S(\{f_m\}, \{g_p\}) = \sum_{n=1}^{N} w_n X w_n^T \qquad \text{(C.11)}$$

$$\sum_{n=1}^{N} w_n T w_n^T = K \qquad \text{(C.12)}$$


But from Equation C.10a, $w_n X$ can always be replaced by $\lambda w_n T$.
Thus Equation C.11 can be written as Equation C.13, where the constraint of
Equation C.12 is used.

$$S(\{f_m\}, \{g_p\}) = \lambda \sum_{n=1}^{N} w_n T w_n^T = \lambda K \qquad \text{(C.13)}$$


Thus the largest eigenvalue $\lambda$ of $(X - \lambda T) = 0$ determines the
transformation that maximizes the mean-square interset distance, subject to
the constraint that the mean-square intraset distance in the set $\{f_m\}$ is
a constant. The transformation is given by Equation C.14, where
$w_1 = (w_{11}, w_{12}, \ldots, w_{1N})$ is the eigenvector corresponding to
the largest eigenvalue, $\lambda_1$, and the other $w_n$ of Equations C.10a
and C.10b are identically zero.
$$
W = \begin{pmatrix}
w_{11} & w_{12} & \cdots & w_{1N} \\
0 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0
\end{pmatrix}
\qquad \text{(C.14)}
$$


The transformation of this equation is singular, which expresses the
fact that the projection of a point in feature space onto the line of
maximum mean-square interset distance and constant mean-square intraset
distance for set $\{f_m\}$ is the single most important feature determining
class membership. This is illustrated in Figure C-2, where the line aa' is in
the direction of the eigenvector with the largest eigenvalue of the matrix
$XT^{-1}$. The point p represents an array of unknown classification with
known values of Feature 1 and Feature 2. The point's projection onto line aa'
is the single best class-separating feature. The point p is classified as
class $\{g_p\}$ because the mean-square difference between its projection on
line aa' and the projection of points belonging to set $\{g_p\}$,
$S(p, \{g_p\})$, is less than $S(p, \{f_m\})$, the corresponding difference
with members of set $\{f_m\}$.
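
The classification rule just described, comparing mean-square differences of
projections onto the line aa', can be sketched as follows. This is only an
illustration: the function name is hypothetical, and the direction vector
`w1` is supplied by hand rather than computed from $XT^{-1}$.

```python
import numpy as np

def classify_by_projection(p, class_f, class_g, w1):
    """Assign p to the class whose members have the smaller mean-square
    difference from p after projection onto the direction w1."""
    w1 = np.asarray(w1, dtype=float)
    proj_p = np.asarray(p, dtype=float) @ w1
    proj_f = np.asarray(class_f, dtype=float) @ w1
    proj_g = np.asarray(class_g, dtype=float) @ w1
    s_f = np.mean((proj_f - proj_p) ** 2)   # S(p, {f_m})
    s_g = np.mean((proj_g - proj_p) ** 2)   # S(p, {g_p})
    return "g" if s_g < s_f else "f"

# Two-feature example in the spirit of Figure C-2 (values are made up).
class_f = [[0, 0], [1, 0], [0, 1]]
class_g = [[5, 5], [6, 5], [5, 6]]
w1 = np.array([1.0, 1.0]) / np.sqrt(2.0)
label = classify_by_projection([4, 4], class_f, class_g, w1)
```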


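For the twenty-feature problems considered in the contract, the three
best class-separating features (or projections) were used. This transformation
is given by Equation C.15, where $w_1 = (w_{11}, w_{12}, \ldots, w_{1N})$,
$w_2 = (w_{21}, w_{22}, \ldots, w_{2N})$, and
$w_3 = (w_{31}, w_{32}, \ldots, w_{3N})$ are the eigenvectors corresponding
to the three largest eigenvalues ($\lambda_1$, $\lambda_2$, and $\lambda_3$),
and where the other $w_n$ of Equations C.10a and C.10b are identically zero.

$$
W = \begin{pmatrix}
w_{11} & w_{12} & \cdots & w_{1N} \\
w_{21} & w_{22} & \cdots & w_{2N} \\
w_{31} & w_{32} & \cdots & w_{3N} \\
0 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0
\end{pmatrix}
\qquad \text{(C.15)}
$$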


Thus, for each array, the three best generalized features f' are calculated
from the original twenty features f using Equations C.15 and C.1.


Figure  C-2.  Singular  Class-Separating  Transformation 


In order to determine the transformation W in Equation C.15, the
eigenvectors w and eigenvalues $\lambda$ of $XT^{-1}$, as indicated in
Equation C.10b, must be obtained. The elements of the matrices X and T are
given by Equations C.6b and C.7b, respectively. The following equations
indicate how these matrices can be expressed in terms of various moment
matrices of the two classes $\{f_m\}$ and $\{g_p\}$.


From Equation C.6b,

$$x_{sr} = x_{rs} = \frac{1}{M_1 M_2} \sum_{m=1}^{M_1} \sum_{p=1}^{M_2} (f_{ms} - g_{ps})(f_{mr} - g_{pr}) \qquad \text{(C.16)}$$
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Define 


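$$\bar{g}_s = \frac{1}{M_1 M_2} \sum_{m=1}^{M_1} \sum_{p=1}^{M_2} g_{ps} \qquad \text{(C.17)}$$

$$\bar{g}_s = \frac{1}{M_2} \sum_{p=1}^{M_2} g_{ps} \qquad \text{(C.18)}$$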


Rewrite the argument of Equation C.16 as

$$(f_{ms} - g_{ps})(f_{mr} - g_{pr}) = ([f_{ms} - \bar{g}_s] - [g_{ps} - \bar{g}_s])([f_{mr} - \bar{g}_r] - [g_{pr} - \bar{g}_r]) \qquad \text{(C.19)}$$


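Using Equations C.19 and C.17, Equation C.16 becomes

$$x_{sr} = \overline{([f_s - \bar{g}_s] - [g_s - \bar{g}_s])([f_r - \bar{g}_r] - [g_r - \bar{g}_r])} \qquad \text{(C.20)}$$

$$x_{sr} = \overline{[f_s - \bar{g}_s][f_r - \bar{g}_r]} - \overline{[f_s - \bar{g}_s][g_r - \bar{g}_r]} - \overline{[g_s - \bar{g}_s][f_r - \bar{g}_r]} + \overline{[g_s - \bar{g}_s][g_r - \bar{g}_r]} \qquad \text{(C.21)}$$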


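The second term of Equation C.21 can be evaluated.

$$\overline{[f_s - \bar{g}_s][g_r - \bar{g}_r]} = \overline{[f_s - \bar{g}_s]}\;\overline{[g_r - \bar{g}_r]} = 0 \qquad \text{(C.22)}$$

since

$$\overline{[g_r - \bar{g}_r]} = \bar{g}_r - \bar{g}_r = 0 \qquad \text{(C.23)}$$

The same result is obtained for the third term of Equation C.21.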


Thus Equation C.21 becomes

$$x_{sr} = \overline{[f_s - \bar{g}_s][f_r - \bar{g}_r]} + \overline{[g_s - \bar{g}_s][g_r - \bar{g}_r]} \qquad \text{(C.24)}$$

Equation C.24 indicates that $x_{sr}$ is the sum of the covariance of class f
about the mean of class g and the covariance of class g about the mean of
class g. Using matrix notation, Equation C.24 can be written

$$X = F(\bar{g}) + G(\bar{g}) \qquad \text{(C.25)}$$

where $F(\bar{g})$ is the twenty-feature covariance matrix of class f about the
twenty-feature mean, $\bar{g}$, of class g, and $G(\bar{g})$ is the covariance
of class g about the mean of class g. If, instead of Equation C.19, the
following substitution is made

$$(f_{ms} - g_{ps})(f_{mr} - g_{pr}) = ([f_{ms} - \bar{f}_s] - [g_{ps} - \bar{f}_s])([f_{mr} - \bar{f}_r] - [g_{pr} - \bar{f}_r]) \qquad \text{(C.26)}$$

then Equation C.25 becomes

$$X = F(\bar{f}) + G(\bar{f}) \qquad \text{(C.27)}$$
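
The decomposition of X into moment matrices can be checked numerically. The
sketch below forms the pairwise cross-set moment matrix directly from its
definition and compares it with the sum of the two class covariances taken
about either class mean. The function names and the random test data are
illustrative assumptions, and covariances are taken with a 1/M normalization.

```python
import numpy as np

def interset_moment(f, g):
    """Mean over all (m, p) pairs of (f_m - g_p)(f_m - g_p)^T."""
    d = f[:, None, :] - g[None, :, :]            # shape (M1, M2, N)
    return np.einsum("mps,mpr->sr", d, d) / (f.shape[0] * g.shape[0])

def moment_about(points, center):
    """Second-moment (covariance-like) matrix of `points` about `center`."""
    d = points - center
    return d.T @ d / points.shape[0]

rng = np.random.default_rng(0)
f = rng.normal(size=(6, 3))                      # class f: 6 arrays, 3 features
g = rng.normal(loc=2.0, size=(8, 3))             # class g: 8 arrays

X = interset_moment(f, g)
# Sum of class moments about the mean of class g ...
X_about_g = moment_about(f, g.mean(axis=0)) + moment_about(g, g.mean(axis=0))
# ... and about the mean of class f.
X_about_f = moment_about(f, f.mean(axis=0)) + moment_about(g, f.mean(axis=0))
```

Both sums reproduce X because the cross terms average to zero whenever the
reference point is the mean of one of the two classes.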


The matrix T can be evaluated from Equation C.7b.

$$t_{sr} = t_{rs} = \frac{1}{(M_1 - 1) M_1} \sum_{m=1}^{M_1} \sum_{p=1}^{M_1} (f_{ms} - f_{ps})(f_{mr} - f_{pr}) \qquad \text{(C.28)}$$


Rewrite the argument of Equation C.28 as

$$(f_{ms} - f_{ps})(f_{mr} - f_{pr}) = ([f_{ms} - \bar{f}_s] - [f_{ps} - \bar{f}_s])([f_{mr} - \bar{f}_r] - [f_{pr} - \bar{f}_r]) \qquad \text{(C.29)}$$


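Using Equations C.29 and C.17, Equation C.28 becomes

$$t_{sr} = \frac{M_1}{M_1 - 1} \left\{ \overline{[f_{ms} - \bar{f}_s][f_{mr} - \bar{f}_r]} - \overline{[f_{ms} - \bar{f}_s][f_{pr} - \bar{f}_r]} - \overline{[f_{ps} - \bar{f}_s][f_{mr} - \bar{f}_r]} + \overline{[f_{ps} - \bar{f}_s][f_{pr} - \bar{f}_r]} \right\} \qquad \text{(C.30)}$$

Using Equation C.23, Equation C.30 becomes

$$t_{sr} = \frac{2 M_1}{M_1 - 1}\, \overline{[f_s - \bar{f}_s][f_r - \bar{f}_r]} \qquad \text{(C.31)}$$

$$T = \frac{2 M_1}{M_1 - 1}\, F(\bar{f}) \qquad \text{(C.32)}$$

Thus, using Equations C.32 and C.27,

$$X T^{-1} \propto [F(\bar{f}) + G(\bar{f})]\, F^{-1}(\bar{f}) \qquad \text{(C.33)}$$

$$= I + G(\bar{f})\, F^{-1}(\bar{f}) \qquad \text{(C.34)}$$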


Thus the eigenvectors w and eigenvalues $\lambda$ of $XT^{-1}$ can be obtained
from the eigenvectors and eigenvalues of $G(\bar{f}) F^{-1}(\bar{f})$.
$G(\bar{f})$ is the covariance matrix in the twenty-feature space of the
arrays of class g calculated about the mean of the arrays of class f.
$F^{-1}(\bar{f})$ is the inverse of the covariance matrix of the arrays of
class f calculated about the mean of the arrays of class f.




Throughout this section the class-separating transformations were developed
by reference to the existence of two sets, $\{f_m\}$ and $\{g_p\}$. The results
obtained by these methods are more general, however, because they apply
directly to the separation of an arbitrary number of sets. For example, in
the maximization of the mean-square interset distance, there is no reason why
the matrix X should involve interset distances between only two sets. An
arbitrary number of sets may be involved, and the interset distances are
simply all those distances measured between two points not in the same set.
Similar arguments are valid for all the other matrices involved. The only
precaution that must be taken concerns the possible use of additional
constraints specifying preferential or nonpreferential treatment of classes.
These additional constraints may require, for instance, that the mean-square
intraset distance of all sets be equal or be related to each other by constants.
Aside from these minor considerations, the results apply to the separation of
any number of classes.


The eigenvectors of $G(\bar{f}) F^{-1}(\bar{f})$ are needed to obtain the
transformation to the three best class-separating generalized features in the
twenty-feature space. The eigenvectors are obtained as follows.

Theorem: If $G(\bar{f})$ and $F(\bar{f})$ are symmetric, positive definite
matrices, then there exists a transformation W such that

(a) $$W F(\bar{f}) W^T = I, \qquad \text{(C.35)}$$

where I is the identity matrix;

(b) $$W G(\bar{f}) W^T = D \qquad \text{(C.36)}$$

where D is a diagonal matrix

and

(c) $$W (G(\bar{f}) F^{-1}(\bar{f})) W^{-1} = D \qquad \text{(C.37)}$$


Since Equation C.37 is a similarity transform of the matrix $GF^{-1}$, D (which
is the same diagonal matrix as in (b)) has diagonal elements which are the
eigenvalues of $GF^{-1}$, and the rows of W are the eigenvectors of $GF^{-1}$.
W is constructed as follows: Let H be the similarity transform for F.


$$H F(\bar{f}) H^{-1} = D_1 \qquad \text{(C.38)}$$

where $D_1$ is the diagonal matrix with diagonal elements equal to the
eigenvalues of F. Since F is symmetric

$$H^{-1} = H^T \qquad \text{(C.39)}$$

Thus

$$H F(\bar{f}) H^T = D_1 \qquad \text{(C.40)}$$

Since the eigenvalues of F are positive,

$$D_1^{-1/2} H F(\bar{f}) H^T D_1^{-1/2} = I \qquad \text{(C.41)}$$

Apply this same operator to G to define $\tilde{G}$,

$$D_1^{-1/2} H G(\bar{f}) H^T D_1^{-1/2} \equiv \tilde{G}(\bar{f}) \qquad \text{(C.42)}$$

Let M be the similarity transform for $\tilde{G}$.

$$M \tilde{G}(\bar{f}) M^{-1} = D \qquad \text{(C.43)}$$

Since $\tilde{G}$ is symmetric

$$M^{-1} = M^T \qquad \text{(C.44)}$$

Thus

$$M \tilde{G}(\bar{f}) M^T = D \qquad \text{(C.45)}$$

where the elements of the diagonal matrix D are the eigenvalues of
$\tilde{G}$. The matrix W which has the properties given in Equations C.35,
C.36, and C.37 is constructed as

$$W = M D_1^{-1/2} H \qquad \text{(C.46)}$$


To show that Equation C.35 is true,

$$W F(\bar{f}) W^T = M D_1^{-1/2} H F(\bar{f}) H^T D_1^{-1/2} M^T \qquad \text{(C.47)}$$

Using Equations C.41 and C.44

$$W F(\bar{f}) W^T = I \qquad \text{(C.48)}$$

Thus Equation C.35 is true when W is given by Equation C.46.

To show that Equation C.36 is true,

$$W G(\bar{f}) W^T = M D_1^{-1/2} H G(\bar{f}) H^T D_1^{-1/2} M^T \qquad \text{(C.49)}$$

Using Equations C.42 and C.43

$$W G(\bar{f}) W^T = D \qquad \text{(C.50)}$$

Thus Equation C.36 is true when W is given by Equation C.46.

To show that Equation C.37 is true, multiply Equations C.36 and C.35
together.

$$W G(\bar{f}) W^T\, W F(\bar{f}) W^T = D \qquad \text{(C.51)}$$

From Equation C.35

$$F(\bar{f}) W^T = W^{-1} \qquad \text{(C.52)}$$

$$F(\bar{f}) = W^{-1} (W^T)^{-1} = (W^T W)^{-1} \qquad \text{(C.53)}$$

$$F^{-1}(\bar{f}) = W^T W \qquad \text{(C.54)}$$

Using Equation C.52 in Equation C.51,

$$W G(\bar{f}) W^T W W^{-1} = D \qquad \text{(C.55)}$$

Using Equation C.54 in Equation C.55,

$$W G(\bar{f}) F^{-1}(\bar{f}) W^{-1} = D \qquad \text{(C.56)}$$

Therefore Equation C.37 is true. Since Equation C.56 is a similarity
transform of the matrix $GF^{-1}$, the diagonal elements of D are the
eigenvalues of $GF^{-1}$, and the rows of W are the eigenvectors of $GF^{-1}$.
These eigenvectors of $GF^{-1}$ are not orthonormal, since

$$W^T \neq W^{-1} \qquad \text{(C.57)}$$

so that

$$W W^T \neq I \qquad \text{(C.58)}$$

Equation C.57 is obtained from Equations C.46, C.44, and C.39, where $D_1$
is the diagonal matrix with diagonal elements equal to the eigenvalues of F,
as given by Equation C.38.

In summary, W is the transformation matrix which maps a twenty-feature
vector of an array into the generalized feature space, the best three of which
will be used to classify the array. W, constructed as in Equation C.46, has
the properties given by Equations C.35, C.36, and C.37.
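
The two-stage construction of W summarized above (diagonalize F, whiten,
then diagonalize the transformed G) can be sketched with a standard symmetric
eigensolver. This is an illustrative sketch, not the contract code: the
function name is hypothetical, `numpy.linalg.eigh` is used in place of
whatever eigenvalue routine was actually employed, and the rows of W are
sorted by decreasing eigenvalue so that the first three rows are the three
best class-separating directions.

```python
import numpy as np

def class_separating_transform(F, G):
    """Return (W, d) with W F W^T = I and W G W^T = diag(d), d descending.
    F and G must be symmetric positive definite."""
    # Step 1: H F H^T = D1, rows of H are orthonormal eigenvectors of F.
    d1, vecs = np.linalg.eigh(F)
    H = vecs.T
    D1_inv_sqrt = np.diag(1.0 / np.sqrt(d1))
    # Step 2: apply the same whitening operator to G.
    G_t = D1_inv_sqrt @ H @ G @ H.T @ D1_inv_sqrt
    G_t = (G_t + G_t.T) / 2.0            # remove rounding asymmetry
    # Step 3: M G_t M^T = D, rows of M are orthonormal eigenvectors of G_t.
    d, vecs_t = np.linalg.eigh(G_t)
    M = vecs_t.T
    W = M @ D1_inv_sqrt @ H
    order = np.argsort(d)[::-1]          # largest eigenvalue first
    return W[order], d[order]

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4))
B = rng.normal(size=(4, 4))
F = A @ A.T + 4.0 * np.eye(4)            # made-up positive definite matrices
G = B @ B.T + 4.0 * np.eye(4)
W, d = class_separating_transform(F, G)
```

Reordering the rows of W does not disturb the identities, since it only
permutes the diagonal entries of D.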
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APPENDIX  D 


PARAMETERIZATION  OF  PROBABILITY  DENSITIES 

The class-conditional probability densities for the arrays in the two
classes f and g are modeled as multi-variate normal distributions in the
n-dimensional feature space, with n = 20. For the class f, the probability
density of obtaining the twenty-feature measurement X is

$$p(X|f) = \frac{1}{(2\pi)^{n/2}}\, |F^{-1}(\bar{f})|^{1/2} \exp\left\{ -\tfrac{1}{2} (X - \bar{f})^T F^{-1}(\bar{f}) (X - \bar{f}) \right\} \qquad \text{(D.1)}$$

$F^{-1}(\bar{f})$ is the inverse of the twenty-feature covariance matrix of
the class f about the twenty-feature mean, $\bar{f}$, of class f.
$|F(\bar{f})|$ is the determinant of $F(\bar{f})$. $(X - \bar{f})$ is the
twenty-feature measurement relative to the mean of class f.

For the class g, the probability density of obtaining the twenty-feature
measurement X is

$$p(X|g) = \frac{1}{(2\pi)^{n/2}}\, |G^{-1}(\bar{g})|^{1/2} \exp\left\{ -\tfrac{1}{2} (X - \bar{g})^T G^{-1}(\bar{g}) (X - \bar{g}) \right\} \qquad \text{(D.2)}$$

The class-conditional probability densities in the best generalized
feature space are found as follows. Equation D.1 involves the quadratic
form

$$(X - \bar{f})^T F^{-1}(\bar{f}) (X - \bar{f}) = y^T F^{-1}(\bar{f})\, y \qquad \text{(D.3)}$$

where $y = (X - \bar{f})$ by definition. From Equation C.1 the transform W
takes a feature vector from the twenty-feature space to the generalized
feature space.

$$y' = W y \qquad \text{(D.4)}$$


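Thus

$$y = W^{-1} y' \qquad \text{(D.5)}$$

$$y^T = (y')^T (W^{-1})^T \qquad \text{(D.6)}$$

The right side of Equation D.3 becomes

$$y^T F^{-1}(\bar{f})\, y = (y')^T (W^{-1})^T F^{-1}(\bar{f}) W^{-1} y' = (y')^T (W F(\bar{f}) W^T)^{-1} y' \qquad \text{(D.7)}$$

Thus in the generalized feature space

$$F'^{-1}(\bar{f}) = (W F(\bar{f}) W^T)^{-1} \qquad \text{(D.8)}$$

in order that the quadratic forms in the generalized and twenty-feature spaces
are equal. Using Equation C.35

$$F'^{-1}(\bar{f}) = I \qquad \text{(D.9)}$$

so that

$$|F'^{-1}(\bar{f})| = 1 \qquad \text{(D.10)}$$

Using Equations D.9 and D.8 in Equation D.7

$$y^T F^{-1}(\bar{f})\, y = (y')^T I\, y' \qquad \text{(D.11)}$$

Substituting for y

$$(X - \bar{f})^T F^{-1}(\bar{f}) (X - \bar{f}) = (X' - \bar{f}')^T I\, (X' - \bar{f}') = (X' - \bar{f}')^T (X' - \bar{f}') \qquad \text{(D.12)}$$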
Substituting Equations D.12 and D.10 into Equation D.1,

$$P(X'|f) = \frac{1}{(2\pi)^{n/2}} \exp\left\{ -\tfrac{1}{2} (X' - \bar{f}')^T (X' - \bar{f}') \right\} \qquad \text{(D.13)}$$

The determination of Equation D.2 in the generalized feature space
proceeds similarly. As in Equation D.8

$$G'^{-1}(\bar{g}) = (W G(\bar{g}) W^T)^{-1} \qquad \text{(D.14)}$$

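However, Equation D.14 involves $G(\bar{g})$, rather than the $G(\bar{f})$
used in Equation C.36. The two are related as follows:

$$
\begin{aligned}
G_{sr}(\bar{f}) &= \overline{[g_s - \bar{f}_s][g_r - \bar{f}_r]} \\
&= \overline{([g_s - \bar{g}_s] - [\bar{f}_s - \bar{g}_s])([g_r - \bar{g}_r] - [\bar{f}_r - \bar{g}_r])} \\
&= \overline{[g_s - \bar{g}_s][g_r - \bar{g}_r]} - \overline{[g_s - \bar{g}_s]}\,[\bar{f}_r - \bar{g}_r] - [\bar{f}_s - \bar{g}_s]\,\overline{[g_r - \bar{g}_r]} + [\bar{f}_s - \bar{g}_s][\bar{f}_r - \bar{g}_r]
\end{aligned}
$$

$$G_{sr}(\bar{f}) = G_{sr}(\bar{g}) + \bar{G}_{sr}(\bar{f}) \qquad \text{(D.15)}$$

where Equation C.23 has been used, and $\bar{G}_{sr}(\bar{f})$ is the
covariance of the mean of class g, $\bar{g}$, about the mean of class f,
$\bar{f}$. In matrix notation Equation D.15 becomes

$$G(\bar{f}) = G(\bar{g}) + \bar{G}(\bar{f}) \qquad \text{(D.16)}$$

Thus

$$G(\bar{g}) = G(\bar{f}) - \bar{G}(\bar{f}) \qquad \text{(D.17)}$$

so that Equation D.14 becomes

$$G'^{-1}(\bar{g}) = (W [G(\bar{f}) - \bar{G}(\bar{f})] W^T)^{-1} \qquad \text{(D.18)}$$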


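From Equation C.36

$$G'^{-1}(\bar{g}) = (D - S)^{-1} \qquad \text{(D.19)}$$

where S is defined to be

$$S = W \bar{G}(\bar{f}) W^T \qquad \text{(D.20)}$$

with $\bar{G}(\bar{f})$ the covariance of the mean of class g about the mean
of class f, and where, from Equation C.37, D is a diagonal matrix with
diagonal elements equal to the eigenvalues of $G(\bar{f}) F^{-1}(\bar{f})$.

Thus using Equation D.19 in Equation D.2,

$$P(X'|g) = \frac{1}{(2\pi)^{n/2}}\, |(D - S)^{-1}|^{1/2} \exp\left\{ -\tfrac{1}{2} (X' - \bar{g}')^T (D - S)^{-1} (X' - \bar{g}') \right\} \qquad \text{(D.21)}$$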

Thus Equations D.13 and D.21 are the class-conditional probability
density functions for classes f and g, respectively.

Equation D.13 gives the probability density P(X'|f) that the generalized
feature vector X' will be obtained for an array in class f. Equation D.21 gives
the probability density P(X'|g) that X' will be obtained for an array in
class g. However, the problem to be addressed in automatic classification is:
given that an array of unknown classification has features locating a vector
X' in generalized feature space, what is the probability P(f|X') that the
array belongs to class f and the probability P(g|X') that it belongs to class
g? Bayes's Theorem states that

$$P(f|X') = \frac{q(f)\, P(X'|f)}{q(f)\, P(X'|f) + q(g)\, P(X'|g)} \qquad \text{(D.22)}$$

$$P(g|X') = \frac{q(g)\, P(X'|g)}{q(g)\, P(X'|g) + q(f)\, P(X'|f)} \qquad \text{(D.23)}$$


q(f) and q(g) are the a priori probabilities that an array belongs to class f
and class g, respectively. Thus, an array of unknown classification has
probability q(f) of being a class f array. However, when the twenty features
X are measured and (through the transformation W) the generalized feature X'
determined for the array, this additional information about the array is used
to calculate the probability P(f|X') that an array of unknown classification
with generalized features X' belongs to class f. The same is true for class g.

An array of unknown classification with generalized features X' is
classified as an f-class array if P(f|X') > P(g|X'). Otherwise, it is
classified as a g-class array.
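
The decision rule can be sketched in a few lines. The code below is a minimal
illustration under stated assumptions: the function names are hypothetical,
the class f density uses an identity inverse covariance as in Equation D.13,
the class g density takes its inverse covariance (the matrix written
$(D - S)^{-1}$ in Equation D.21) as an input, and equal priors are used by
default only for the example.

```python
import numpy as np

def gaussian_density(x, mean, cov_inv):
    """Multivariate normal density written in terms of the inverse covariance."""
    n = len(x)
    diff = x - mean
    norm = np.sqrt(np.linalg.det(cov_inv)) / (2.0 * np.pi) ** (n / 2.0)
    return norm * np.exp(-0.5 * diff @ cov_inv @ diff)

def classify(x, mean_f, mean_g, cov_g_inv, q_f=0.5, q_g=0.5):
    """Return (label, P(f|x), P(g|x)) from Bayes's Theorem."""
    p_f = gaussian_density(x, mean_f, np.eye(len(x)))   # class f: identity
    p_g = gaussian_density(x, mean_g, cov_g_inv)        # class g: (D - S)^-1
    total = q_f * p_f + q_g * p_g
    post_f = q_f * p_f / total
    post_g = q_g * p_g / total
    return ("f" if post_f > post_g else "g"), post_f, post_g

# Three generalized features; the class means here are made-up values.
mean_f = np.zeros(3)
mean_g = np.full(3, 3.0)
label_a, pf_a, pg_a = classify(np.zeros(3), mean_f, mean_g, np.eye(3))
label_b, pf_b, pg_b = classify(np.full(3, 3.0), mean_f, mean_g, np.eye(3))
```

A vector at the class f mean is assigned to f and a vector at the class g
mean is assigned to g; in both cases the two posterior probabilities sum to
one.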
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